<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.3.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>corprewland &#187; computer programming</title>
	<link>http://www.corprew.org</link>
	<description>(dis)information organization</description>
	<pubDate>Wed, 09 Jul 2008 00:37:14 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.3</generator>
	<language>en</language>
			<item>
		<title>drupal and taxonomy</title>
		<link>http://www.corprew.org/blog/2008/04/17/drupal-and-taxonomy/</link>
		<comments>http://www.corprew.org/blog/2008/04/17/drupal-and-taxonomy/#comments</comments>
		<pubDate>Thu, 17 Apr 2008 07:38:39 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[computer programming]]></category>

		<category><![CDATA[acls]]></category>

		<category><![CDATA[bees]]></category>

		<category><![CDATA[cmses]]></category>

		<category><![CDATA[database hacking]]></category>

		<category><![CDATA[drupal]]></category>

		<category><![CDATA[drupal5]]></category>

		<category><![CDATA[mysql]]></category>

		<category><![CDATA[node-based-cms]]></category>

		<category><![CDATA[open source]]></category>

		<category><![CDATA[organic systems]]></category>

		<category><![CDATA[simple solutions to complex problems]]></category>

		<category><![CDATA[taxonomy]]></category>

		<category><![CDATA[taxos]]></category>

		<guid isPermaLink="false">http://www.corprew.org/blog/2008/04/17/drupal-and-taxonomy/</guid>
		<description><![CDATA[Drupal 5 has a few problems in its security layer, as I&#8217;ve mentioned other places, and some of them stem from the sort of &#8216;it-works-for-me&#8217; philosophy of open source.   This is particularly a problem in a complex system like Drupal, which in most installations is made up of a few dozen modules in [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.drupal.org/">Drupal 5</a> has a few problems in its security layer, as I&#8217;ve mentioned other places, and some of them stem from the sort of &#8216;it-works-for-me&#8217; philosophy of open source.   This is particularly a problem in a complex system like Drupal, which in most installations is made up of a few dozen modules in addition to the core.</p>
<p>The current issue I&#8217;m having is that nodes created by the aggregation module get their taxonomy stripped when they&#8217;re updated because of how another module uses the security functionality, which is just <em>hilarious</em> in a site that&#8217;s largely organized organically by taxonomy. So, after talking with the people I&#8217;m working for on the site, I ended up creating a simple PHP script to run through cron that fixes the issues &#8216;the hard way.&#8217;</p>
<p>If you check out this query&#8230;</p>

<div class="wp_syntax"><div class="code"><pre class="php"><span style="color: #000000; font-weight: bold;">function</span> fix_object<span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$name</span>, <span style="color: #0000ff;">$sqlcon</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#123;</span>
  <span style="color: #0000ff;">$query</span> = <span style="color: #ff0000;">&quot;SELECT term_data.name name, term_data.tid termid, node.nid nodeid, node.title title FROM node LEFT JOIN term_node  ON ( term_node.nid = node.nid ) LEFT JOIN term_data ON ( term_data.tid = term_node.tid ) WHERE node.type = 'aggregation_item' AND node.title LIKE 'Xxxxx &quot;</span> . <span style="color: #0000ff;">$name</span> . <span style="color: #ff0000;">&quot;%'&quot;</span>;
&nbsp;
  <span style="color: #808080; font-style: italic;">// Perform Query</span>
  <span style="color: #0000ff;">$result</span> = <span style="color: #000066;">mysql_query</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$query</span><span style="color: #66cc66;">&#41;</span>;
 <span style="color: #808080; font-style: italic;">// ... and so on...</span></pre></div></div>

<p>You can see that this is a fairly normal SQL query that looks for all the nodes of type aggregation_item whose titles match a particular pattern.  Because of the way the joins are structured, any nodes that have lost their taxonomies will have NULL for name and termid.  Those nodeids with NULL termids can then have the proper taxonomy entries stuffed back into them&#8230;</p>

<div class="wp_syntax"><div class="code"><pre class="php"><span style="color: #000000; font-weight: bold;">function</span> insert_taxo_4_node<span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$node_id</span>, <span style="color: #0000ff;">$taxo_id</span>, <span style="color: #0000ff;">$con</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#123;</span>
  <span style="color: #0000ff;">$query</span> = <span style="color: #ff0000;">&quot;INSERT INTO term_node (nid, tid) VALUES (&quot;</span>. <span style="color: #0000ff;">$node_id</span> . <span style="color: #ff0000;">&quot;,&quot;</span> . <span style="color: #0000ff;">$taxo_id</span> . <span style="color: #ff0000;">&quot;)&quot;</span>;
&nbsp;
  <span style="color: #0000ff;">$result</span> = <span style="color: #000066;">mysql_query</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$query</span><span style="color: #66cc66;">&#41;</span>;
  <span style="color: #808080; font-style: italic;">// Check result</span>
  <span style="color: #808080; font-style: italic;">// This shows the actual query sent to MySQL, and the error. Useful for debugging.</span>
  <span style="color: #b1b100;">if</span> <span style="color: #66cc66;">&#40;</span>!<span style="color: #0000ff;">$result</span><span style="color: #66cc66;">&#41;</span> 
    <span style="color: #66cc66;">&#123;</span>
      <span style="color: #0000ff;">$message</span>  = <span style="color: #ff0000;">'Invalid query: '</span> . <span style="color: #000066;">mysql_error</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> . <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>;
      <span style="color: #0000ff;">$message</span> .= <span style="color: #ff0000;">'Whole query: '</span> . <span style="color: #0000ff;">$query</span>;
      <span style="color: #000066;">die</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$message</span><span style="color: #66cc66;">&#41;</span>;
    <span style="color: #66cc66;">&#125;</span>
<span style="color: #66cc66;">&#125;</span></pre></div></div>
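<p>To make the LEFT JOIN trick a little more concrete, here is a rough sketch in Ruby that imitates it on plain in-memory arrays.  It is not the cron script itself: the rows, the <code>joined_rows</code> helper and the term id 7 are all made up for illustration, and it assumes each orphaned node should get one known term back.</p>

```ruby
# In-memory stand-ins for the node and term_node tables (hypothetical rows).
NODES = [
  { nid: 1, type: "aggregation_item", title: "Xxxxx alpha one" },
  { nid: 2, type: "aggregation_item", title: "Xxxxx alpha two" },
  { nid: 3, type: "story",            title: "Xxxxx alpha three" }
]
TERM_NODE = [{ nid: 1, tid: 7 }]

# Emulate the LEFT JOIN: every matching node yields at least one row, and a
# node with no term_node entry yields a row whose :tid is nil (SQL NULL).
def joined_rows(nodes, term_node, name)
  matching = nodes.select do |n|
    n[:type] == "aggregation_item" && n[:title].start_with?("Xxxxx #{name}")
  end
  matching.flat_map do |n|
    links = term_node.select { |tn| tn[:nid] == n[:nid] }
    links = [{ tid: nil }] if links.empty?
    links.map { |tn| { nid: n[:nid], tid: tn[:tid] } }
  end
end

# Re-attach the given term to every node whose join came back with a nil tid,
# and return the ids of the nodes that were repaired.
def fix_object(nodes, term_node, name, tid)
  orphans = joined_rows(nodes, term_node, name).select { |r| r[:tid].nil? }
  orphans.each { |r| term_node << { nid: r[:nid], tid: tid } }
  orphans.map { |r| r[:nid] }
end

fixed = fix_object(NODES, TERM_NODE, "alpha", 7)
puts "re-tagged nodes: #{fixed.inspect}"
```

<p>Running the fix a second time finds nothing to do, which is the property you want from something that runs out of cron.</p>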

<p>I&#8217;m largely posting this in case people run into the same problem.  This is a hilariously simple fix for a difficult-to-fix problem in Drupal, but it&#8217;s also a generic information architecture issue: what to do when the system that you&#8217;re working on is unreliable.  I should probably mention that the issues with security in Drupal aren&#8217;t related to authentication, but instead are related to item ACLs denying access to things for strange reasons, and are not crucial security bugs in the OMG MUST PATCH NOW sense.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2008/04/17/drupal-and-taxonomy/feed/</wfw:commentRss>
		</item>
		<item>
		<title>ruby/rails and the invited beta</title>
		<link>http://www.corprew.org/blog/2008/04/16/rubyrails-and-the-invited-beta/</link>
		<comments>http://www.corprew.org/blog/2008/04/16/rubyrails-and-the-invited-beta/#comments</comments>
		<pubDate>Thu, 17 Apr 2008 06:31:59 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[computer programming]]></category>

		<category><![CDATA[account]]></category>

		<category><![CDATA[acts_as_authenticated]]></category>

		<category><![CDATA[facebook]]></category>

		<category><![CDATA[guid]]></category>

		<category><![CDATA[invite]]></category>

		<category><![CDATA[invite functionality]]></category>

		<category><![CDATA[private beta]]></category>

		<category><![CDATA[programming]]></category>

		<category><![CDATA[rails]]></category>

		<category><![CDATA[ror]]></category>

		<category><![CDATA[ruby]]></category>

		<category><![CDATA[slicehost]]></category>

		<category><![CDATA[tristero]]></category>

		<category><![CDATA[welcome screen]]></category>

		<guid isPermaLink="false">http://www.corprew.org/blog/2008/04/16/rubyrails-and-the-invited-beta/</guid>
		<description><![CDATA[I&#8217;ve been working on a website in RoR for the last while, and it&#8217;s about to go live in the private beta sort of way that seems to be so popular these days.  It&#8217;s handy that way, because that way I can set up the site at slicehost or similar and not have to [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been working on a website in RoR for the last while, and it&#8217;s about to go live in the private beta sort of way that seems to be so popular these days.  It&#8217;s handy that way, because that way I can set up the site at slicehost or similar and not have to worry (too much) about my server slowing from getting overloaded.  This same site&#8217;s next incarnation is going to be facebook related, so that should overwhelm any sense of moderation (if I&#8217;m lucky.)</p>
<p>So, the key method of invitation to a private beta is that you mail someone a code allowing them access to the system.  For these purposes, let&#8217;s just assume that the code is some reasonably long unique string (in my code, it&#8217;s actually a <code>uuid</code>.)  So, set up a migration something like this to manage them:</p>

<div class="wp_syntax"><div class="code"><pre class="ruby">  <span style="color:#9966CC; font-weight:bold;">def</span> <span style="color:#0000FF; font-weight:bold;">self</span>.<span style="color:#9900CC;">up</span>
    create_table <span style="color:#ff3333; font-weight:bold;">:invites</span> <span style="color:#9966CC; font-weight:bold;">do</span> |t|
<span style="color:#008000; font-style:italic;"># deleted stuff</span>
      t.<span style="color:#CC0066; font-weight:bold;">string</span> <span style="color:#ff3333; font-weight:bold;">:guid</span>
      t.<span style="color:#CC0066; font-weight:bold;">integer</span> <span style="color:#ff3333; font-weight:bold;">:used_yet</span>
<span style="color:#008000; font-style:italic;"># deleted stuff</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div>

<p><code>used_yet</code> isn&#8217;t a boolean for reasons that are too laborious to go into here, but reflect some functionality in the code that I&#8217;m not going to display.  Assuming that you&#8217;re using acts_as_authenticated and are redirecting anyone who tries to access your app to the default welcome page according to the usual methods, set up something like this in your routes.rb:</p>

<div class="wp_syntax"><div class="code"><pre class="ruby">  map.<span style="color:#9900CC;">root</span> <span style="color:#ff3333; font-weight:bold;">:controller</span> =&gt; <span style="color:#996600;">&quot;welcome&quot;</span></pre></div></div>

<p>This is probably the case in like half the rails apps out there.  Have the <code>index</code> method of the welcome controller put up a form with a field like:</p>

<div class="wp_syntax"><div class="code"><pre class="ruby"><span style="color:#008000; font-style:italic;">#let's see if the formatter can handle rails erb without exploding.</span>
&lt;% form_tag<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">'welcome/checkinvite'</span>, <span style="color:#ff3333; font-weight:bold;">:method</span>=&gt;:get<span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#9966CC; font-weight:bold;">do</span> -%&gt;
  &lt;%= text_field_tag <span style="color:#996600;">'invite'</span> %&gt;
  &lt;%= submit_tag <span style="color:#996600;">'begin'</span> %&gt;
&lt;% <span style="color:#9966CC; font-weight:bold;">end</span> -%&gt;</pre></div></div>

<p>This lets the user enter their invite in more or less the normal method.  Now in your &#8216;welcome&#8217; controller, you&#8217;ll need a &#8216;checkinvite&#8217; method that looks something like the following:</p>

<div class="wp_syntax"><div class="code"><pre class="ruby"> <span style="color:#9966CC; font-weight:bold;">def</span> checkinvite
    <span style="color:#0066ff; font-weight:bold;">@inviteguid</span> = params<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#ff3333; font-weight:bold;">:invite</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    <span style="color:#0066ff; font-weight:bold;">@invite</span> = Invite.<span style="color:#9900CC;">find_by_guid</span><span style="color:#006600; font-weight:bold;">&#40;</span>@inviteguid<span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#9966CC; font-weight:bold;">if</span><span style="color:#006600; font-weight:bold;">&#40;</span>@invite == <span style="color:#0000FF; font-weight:bold;">nil</span><span style="color:#006600; font-weight:bold;">&#41;</span>
      flash<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#ff3333; font-weight:bold;">:notice</span><span style="color:#006600; font-weight:bold;">&#93;</span> = <span style="color:#996600;">&quot;Your invite was invalid&quot;</span>
      redirect_to root_url
      <span style="color:#0000FF; font-weight:bold;">return</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
    <span style="color:#9966CC; font-weight:bold;">if</span><span style="color:#006600; font-weight:bold;">&#40;</span>@invite.<span style="color:#9900CC;">used_yet</span> == <span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#41;</span>
      flash<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#ff3333; font-weight:bold;">:notice</span><span style="color:#006600; font-weight:bold;">&#93;</span> = <span style="color:#996600;">&quot;Your invite had already been used&quot;</span>
      redirect_to root_url
      <span style="color:#0000FF; font-weight:bold;">return</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div>

<p>After this, you&#8217;ll need to have some code in your HTML page that links you to the account/signup functionality of acts_as_authenticated.  I&#8217;m not going to include that because I&#8217;m too lazy to fish it out of my app functionality, but you can do that pretty much with a link_to using <code>:invite=&gt;@inviteguid</code> as an extra parameter.</p>
<p>You need to put the same invite detection code in account/signup, and then when you&#8217;ve created the account, set <code>invite.used_yet = 1</code>.  This is about as simple as a method that I can think of for doing the private beta functionality that seems to be so much in vogue these days.  Enjoy.</p>
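<p>If it helps to see the whole invite lifecycle in one place, here is a minimal plain-Ruby sketch of it, with no Rails involved.  The <code>InviteBook</code> class and its method names are made up for this example, and the hash stands in for the <code>invites</code> table; only <code>guid</code> and <code>used_yet</code> come from the migration above.</p>

```ruby
require "securerandom"

# Stand-in for the Invite model; only the two columns shown in the migration.
Invite = Struct.new(:guid, :used_yet)

# Hypothetical in-memory replacement for the invites table.
class InviteBook
  def initialize
    @invites = {}
  end

  # Create a fresh, unused invite and return its code for mailing out.
  def issue
    guid = SecureRandom.uuid
    @invites[guid] = Invite.new(guid, 0)
    guid
  end

  # Same checks as checkinvite: returns :invalid, :used, or :ok.
  def check(guid)
    invite = @invites[guid]
    return :invalid if invite.nil?
    return :used if invite.used_yet == 1
    :ok
  end

  # Called once the account exists; returns false if the code cannot be redeemed.
  def redeem(guid)
    return false unless check(guid) == :ok
    @invites[guid].used_yet = 1
    true
  end
end

book = InviteBook.new
code = book.issue
puts book.check("not-a-real-code")
puts book.check(code)
book.redeem(code)
puts book.check(code)
```

<p>The point of running the same <code>check</code> again inside <code>redeem</code> is the one made above: the signup step has to repeat the detection rather than trust that the welcome page already did it.</p>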
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2008/04/16/rubyrails-and-the-invited-beta/feed/</wfw:commentRss>
		</item>
		<item>
		<title>iPhone Development</title>
		<link>http://www.corprew.org/blog/2008/04/08/iphone-development/</link>
		<comments>http://www.corprew.org/blog/2008/04/08/iphone-development/#comments</comments>
		<pubDate>Wed, 09 Apr 2008 00:47:19 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[computer programming]]></category>

		<category><![CDATA[beta]]></category>

		<category><![CDATA[bloggeroutragesyndrome]]></category>

		<category><![CDATA[iphone]]></category>

		<category><![CDATA[iphone dev]]></category>

		<guid isPermaLink="false">http://www.corprew.org/blog/2008/04/08/iphone-development/</guid>
		<description><![CDATA[There have been a lot of people asking angry questions to Apple today because the Apple &#946; that they gave out to iPhone developers was timed to expire today and a lot of devs now have bricked their main mobile phone until an update appears.  Lots of people appear angry, but they&#8217;re missing the [...]]]></description>
			<content:encoded><![CDATA[<p>There have been a lot of people asking angry questions to Apple today because the Apple &beta; that they gave out to iPhone developers was timed to expire today and a lot of devs now have bricked their main mobile phone until an update appears.  Lots of people appear angry, but they&#8217;re missing the main issue for Apple:</p>
<blockquote><p><em>Dear Apple, why are you letting people this stupid into your &beta; programs</em></p></blockquote>
<p>People frequently forget what beta for software means in these days where everything is &beta; until people find a way to make money off of it.  It means untested, believed working properly but may blow up at any time, not ready for production.  So, I&#8217;m halfway between bemused and annoyed at the <a href="http://it.slashdot.org/it/08/04/08/1932232.shtml">outrage</a> that some folks seem to be fielding on various <a href="http://discussions.apple.com/thread.jspa?threadID=1476975&#038;tstart=0">fora</a>.</p>
<p>Also, calling a phone &#8216;bricked&#8217; when you can easily recover it by downloading new software hours later is hitting the epistemological puff pastry with a hammer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2008/04/08/iphone-development/feed/</wfw:commentRss>
		</item>
		<item>
		<title>unix: year 19</title>
		<link>http://www.corprew.org/blog/2008/04/03/unix-year-19/</link>
		<comments>http://www.corprew.org/blog/2008/04/03/unix-year-19/#comments</comments>
		<pubDate>Thu, 03 Apr 2008 17:39:39 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[computer programming]]></category>

		<category><![CDATA[linux]]></category>

		<category><![CDATA[security]]></category>

		<category><![CDATA[sudo]]></category>

		<category><![CDATA[unix]]></category>

		<guid isPermaLink="false">http://www.corprew.org/blog/2008/04/03/unix-year-19/</guid>
		<description><![CDATA[My first access to a unix machine was around 19 years ago, and I&#8217;m still amazed that sudo tcsh is a valid command on most systems.
I&#8217;m not saying that it isn&#8217;t convenient, mind you, but the fact that I can then execute emacs is also hilarious.  Especially because sudo emacs is prohibited.
here is your [...]]]></description>
			<content:encoded><![CDATA[<p>My first access to a unix machine was around 19 years ago, and I&#8217;m <em>still</em> amazed that <code>sudo tcsh</code> is a valid command on most systems.</p>
<p>I&#8217;m not saying that it isn&#8217;t convenient, mind you, but the fact that I can then execute <code>emacs</code> is also hilarious.  Especially because <code>sudo emacs</code> is prohibited.</p>
<p><em>here is your system log, let me save you the trouble of auditing it by running a shell</em>.</p>
<p>(I&#8217;m aware, incidentally, that it&#8217;s basically impossible to stop people from running a shell as long as they can run any naive-turing-complete interpreter or compiler.  Maybe it&#8217;s time to only fight battles you can win.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2008/04/03/unix-year-19/feed/</wfw:commentRss>
		</item>
		<item>
		<title>writing code on the macintosh and its derivatives</title>
		<link>http://www.corprew.org/blog/2008/03/12/writing-code-on-the-macintosh-and-its-derivatives/</link>
		<comments>http://www.corprew.org/blog/2008/03/12/writing-code-on-the-macintosh-and-its-derivatives/#comments</comments>
		<pubDate>Thu, 13 Mar 2008 02:10:20 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[computer programming]]></category>

		<category><![CDATA[CoreData]]></category>

		<category><![CDATA[iphone]]></category>

		<category><![CDATA[iphone process model]]></category>

		<category><![CDATA[macosx]]></category>

		<category><![CDATA[ManagedObjectContext]]></category>

		<category><![CDATA[objc]]></category>

		<category><![CDATA[objective c]]></category>

		<category><![CDATA[ogm]]></category>

		<category><![CDATA[orm]]></category>

		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.corprew.org/blog/2008/03/12/writing-code-on-the-macintosh-and-its-derivatives/</guid>
		<description><![CDATA[note that the latest revision of this blog&#8217;s theme seems to have introduced a weird bug with the code layout plugin (wp-syntax) on some browsers.  i&#8217;m looking into it.
I think the single most useful thing I&#8217;ve figured out recently in programming for MacOSX and the iPhone is this little snippet right here.

- &#40;void&#41; updateListForEntityNamed:&#40;NSString*&#41; [...]]]></description>
			<content:encoded><![CDATA[<p><b>note that the latest revision of this blog&#8217;s theme seems to have introduced a weird bug with the code layout plugin (wp-syntax) on some browsers.  i&#8217;m looking into it.</b></p>
<p>I think the single most useful thing I&#8217;ve figured out recently in programming for MacOSX and the iPhone is this little snippet right here.</p>

<div class="wp_syntax"><div class="code"><pre class="objc">- <span style="color: #002200;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #002200;">&#41;</span> updateListForEntityNamed:<span style="color: #002200;">&#40;</span><span style="color: #0000ff;">NSString</span>*<span style="color: #002200;">&#41;</span> entityName andSearchString:<span style="color: #002200;">&#40;</span><span style="color: #0000ff;">NSString</span>*<span style="color: #002200;">&#41;</span> queryString
<span style="color: #002200;">&#123;</span>
<span style="color: #002200;">&#91;</span>...<span style="color: #002200;">&#93;</span>
&nbsp;
	MyDocument* current = <span style="color: #002200;">&#91;</span><span style="color: #002200;">&#91;</span><span style="color: #0000ff;">NSDocumentController</span> sharedDocumentController<span style="color: #002200;">&#93;</span> currentDocument<span style="color: #002200;">&#93;</span>;
	<span style="color: #0000ff;">if</span><span style="color: #002200;">&#40;</span>current &amp;&amp; current != self<span style="color: #002200;">&#41;</span>
	<span style="color: #002200;">&#123;</span>
		NSLog<span style="color: #002200;">&#40;</span>@<span style="color: #666666;">&quot;CurrentDocument:%@ != self:%@&quot;</span>, current, self<span style="color: #002200;">&#41;</span>;
		<span style="color: #002200;">&#91;</span>current updateListForEntityNamed: entityName andSearchString: queryString<span style="color: #002200;">&#93;</span>;
		<span style="color: #0000ff;">return</span>;
	<span style="color: #002200;">&#125;</span>
<span style="color: #002200;">&#91;</span>...<span style="color: #002200;">&#93;</span>
<span style="color: #002200;">&#125;</span></pre></div></div>

<p>What this does is intercept incoming messages that are supposed to go to the current window&#8217;s document, and redirect them to that instance.  I&#8217;ve run into cases in Leopard (MacOSX 10.5) where such messages arrive at a different instance.  To some extent, this is probably a misconfiguration in Interface Builder somewhere, but it is also an issue when using CoreData, because the <code>NSManagedObjectContext</code>s are particular to instances of <code>NSPersistentDocument</code>, and problems arise if you end up using the wrong context.</p>
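<p>The forwarding idea itself isn&#8217;t specific to Cocoa.  As a rough, language-neutral sketch in Ruby, where the <code>Document</code> class, its class-level <code>current</code> accessor and the <code>handled</code> log are all invented stand-ins for the real document controller:</p>

```ruby
class Document
  class << self
    # Hypothetical stand-in for the document controller's current document.
    attr_accessor :current
  end

  attr_reader :name, :handled

  def initialize(name)
    @name = name
    @handled = []
  end

  def update_list(entity_name, query)
    current = self.class.current
    if current && !current.equal?(self)
      # The wrong instance got the message: hand it to the current document.
      return current.update_list(entity_name, query)
    end
    # Only the current document ever records (that is, does) the work.
    @handled << [entity_name, query]
    self
  end
end

front = Document.new("front")
back = Document.new("back")
Document.current = front

back.update_list("Person", "smith")
puts front.handled.inspect
puts back.handled.inspect
```

<p>The guard at the top of the method is the whole trick: whichever instance receives the call, the work is done by the current one, so it is always done against that document&#8217;s own state.</p>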
<p>I am slowly becoming a great fan of CoreData, it&#8217;s a great <a href="http://developer.apple.com/macosx/coredata.html">persistence/object-graph-management layer</a>.  More on this later.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2008/03/12/writing-code-on-the-macintosh-and-its-derivatives/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Classifiers and Classification</title>
		<link>http://www.corprew.org/blog/2007/06/27/classifiers-and-classification/</link>
		<comments>http://www.corprew.org/blog/2007/06/27/classifiers-and-classification/#comments</comments>
		<pubDate>Wed, 27 Jun 2007 22:49:04 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[classification]]></category>

		<category><![CDATA[computer programming]]></category>

		<guid isPermaLink="false">http://www.corprew.org/2007/06/27/classifiers-and-classification/</guid>
		<description><![CDATA[For the last while, I&#8217;ve been working on a project that involves scanning large numbers of RSS/Atom feeds, and then using Bayesian1 classifiers to break it into one of a number of categories for summarization and display (the system that I&#8217;m using to do this is available as a sample website, but really needs more [...]]]></description>
			<content:encoded><![CDATA[<p>For the last while, I&#8217;ve been working on a project that involves scanning large numbers of RSS/Atom feeds, and then using Bayesian<sup>1</sup> classifiers to sort each item into one of a number of categories for summarization and display (the system that I&#8217;m using to do this is available as a sample website, but really needs more data in the training sets before it&#8217;s ready to entertain all of you.)  The categories are pretty straightforward, and they fit into a somewhat neat controlled vocabulary (ontology/thesaurus/whatever.)</p>
<p>There&#8217;s a relation, though, between the different terms in this sort of classification and the training data used to build the Bayesian Classifier.  If the terms are arranged in a hierarchy (and certain assumptions are made about that hierarchy, like subterms encompassing part of the range of meaning of their parent term and nothing else)<sup>2</sup>, then the training data used for classifying terms can be shared.</p>
<p>For example, all <strong>positive</strong> training data that belongs to the child terms can also be used for the parent.  So, for (a constructed) example, positive training data for <em>tamiflu</em> also belongs in the positive data for <em>bird flu vaccines</em>.  The reverse is true of <strong>negative</strong> training data.  For negative data, the negative data for the parent can also be used for the child terms.</p>
<p>This is highly useful information when you&#8217;re making a large-scale text classifier (and having it classify texts as belonging to categories or not, as opposed to just clustering texts into the categories that actually appear).  It&#8217;s easier to use things like Bayesian classifiers to do this if you&#8217;re looking for somewhat fine-grained detail.</p>
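<p>The sharing rule for hierarchical training data can be sketched in a few lines of Ruby.  This is only an illustration under the hierarchy assumptions stated earlier: the two-term vocabulary reuses the constructed tamiflu example, the document strings are placeholders rather than real training data, and the helper names are invented here (they are not part of Classifier4J).</p>

```ruby
# Hypothetical two-level vocabulary: child term => parent term (nil at the root).
PARENT = {
  "tamiflu" => "bird flu vaccines",
  "bird flu vaccines" => nil
}

# Placeholder training sets, keyed by term.
POSITIVE = {
  "tamiflu" => ["doc A"],
  "bird flu vaccines" => ["doc B"]
}
NEGATIVE = {
  "tamiflu" => ["doc C"],
  "bird flu vaccines" => ["doc D"]
}

# Walk up the hierarchy, collecting every ancestor of the term.
def ancestors(term, parent)
  chain = []
  while (term = parent[term])
    chain << term
  end
  chain
end

# Walk down the hierarchy, collecting every descendant of the term.
def descendants(term, parent)
  kids = parent.select { |_child, p| p == term }.keys
  kids + kids.flat_map { |k| descendants(k, parent) }
end

# Positive examples flow up: a parent may reuse its descendants' positives.
def positives_for(term, positive, parent)
  ([term] + descendants(term, parent)).flat_map { |t| positive.fetch(t, []) }.uniq
end

# Negative examples flow down: a child may reuse its ancestors' negatives.
def negatives_for(term, negative, parent)
  ([term] + ancestors(term, parent)).flat_map { |t| negative.fetch(t, []) }.uniq
end

puts positives_for("bird flu vaccines", POSITIVE, PARENT).inspect
puts negatives_for("tamiflu", NEGATIVE, PARENT).inspect
```

<p>Note the asymmetry: the parent does not pick up the child&#8217;s negatives, and the child does not pick up the parent&#8217;s positives, since neither inference holds in general.</p>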
<p>Currently, I&#8217;ve been using <a href="http://classifier4j.sourceforge.net/">Classifier4J</a> for doing the classification and text summarization<sup>3</sup>.  The text summarization is sort of annoying, though, because it&#8217;s based on a simple statistical choice of sentences which occasionally picks up date-lines and partial phrases because of what&#8217;s &#8216;important.&#8217;  I&#8217;m resisting the urge to go completely POS-tagging nuts on the whole thing and select only sentences of certain types or completeness because this is, after all, a side project.  (The number of times I see things like &#8216;this sentence no verb.&#8217; is astounding, though, and slowly driving me nuts.)</p>
<p>So, another day in the life.</p>
<p><sup>1</sup> although i&#8217;m also using a vector space classifier for a related, larger project and it&#8217;s driving me less nuts training it.<br />
<sup>2</sup> this is called a meronymous (&#8216;part-of&#8217;) relationship, and given that half the people who regularly read this blog were in LIS530 or its equivalent at some point, you should remember this.<br />
<sup>3</sup> and will probably eventually switch to jBNC http://jbnc.sourceforge.net/ before i go nuts</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2007/06/27/classifiers-and-classification/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Long Live Locks! (part 1)</title>
		<link>http://www.corprew.org/blog/2007/05/07/long-live-locks-part-1/</link>
		<comments>http://www.corprew.org/blog/2007/05/07/long-live-locks-part-1/#comments</comments>
		<pubDate>Mon, 07 May 2007 19:56:38 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[computer programming]]></category>

		<category><![CDATA[general]]></category>

		<guid isPermaLink="false">http://www.corprew.org/2007/05/07/long-live-locks-part-1/</guid>
		<description><![CDATA[So, I was talking to someone today about their application (which was Ruby on Rails-based), and we had a long conversation about locking.  There&#8217;s a couple of different sorts of locks that show up in software development, but there&#8217;s one in particular that mostly only shows up in enterprise software development, the Long-lived Lock.
Locks [...]]]></description>
			<content:encoded><![CDATA[<p>So, I was talking to someone today about their application (which was Ruby on Rails-based), and we had a long conversation about locking.  There&#8217;s a couple of different sorts of locks that show up in software development, but there&#8217;s one in particular that mostly only shows up in enterprise software development, the <em>Long-lived Lock</em>.</p>
<p>Locks are used to keep other processes from modifying resources in the system.  These can show up at a variety of levels ranging from Critical Sections (<a href="http://www.javaworld.com/javaworld/jw-04-1999/jw-04-toolbox_p.html">Java</a> / <a href="http://msdn2.microsoft.com/en-us/library/ms682530.aspx">Win</a> ) that synchronize access to particular pieces of code, to database locks, which keep people from reading from or writing to rows or tables while operations are done.</p>
<p>However, all of these operations are for short periods of time.  You can&#8217;t keep a read or write lock on a row in a database for an extended period of time (or in cases where you can, you almost certainly <em>shouldn&#8217;t</em>.)  About the longest time a row in a database should be locked is to perform a single transaction (which may be spread between multiple databases, rows, or what have you, but the time is just the changes for the transaction, not all the time that people spend staring at a screen and entering data before hitting the return key.)</p>
<p>But how do you let a user lock information for an extended period of time?  For example, say the user is locking a row in the database that represents a document that they&#8217;re updating (a frequent setup in most ECM/DM systems.)  Well, since that&#8217;s part of the ECM system, that should happen inside the logic of that application.  It shouldn&#8217;t be achieved through database locking, but should instead be stored as information within the database.</p>
<p>It&#8217;s possible to set this up a number of different ways, but let&#8217;s assume you have a document table <code>document</code> and it has, by convention, an <code>id</code> column that represents the primary key on the table.  I&#8217;m also going to make the assumption that writing to a document is done by a particular user.  Your application&#8217;s security system may vary.</p>
<p>So, let&#8217;s look at a table set up for locking on the <code>document</code> table:</p>
<p><code>TABLE doc_lock<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;document_id : INTEGER<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;user_id : INTEGER<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lock_expires: DATETIME<br />
END</code></p>
<p>And you just join this table in when you need to know if there are locks on a particular object, and you otherwise create and delete locks as needed.  One particular thing about this sort of locking strategy is that you end up with expired locks accumulating on documents, so you want to clean those up, and also when you join in the lock table you want to have non-expired locks only.</p>
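<p>As a rough illustration of those operations (create, look up non-expired locks only, delete, clean up), here is a small in-memory sketch in Ruby.  The <code>LockTable</code> class and its method names are invented for this example and the array stands in for the <code>doc_lock</code> table; a real version would run against the database and would need the check-then-insert step to be atomic.</p>

```ruby
# One row of the doc_lock table described above.
DocLock = Struct.new(:document_id, :user_id, :lock_expires)

# Hypothetical in-memory stand-in for the doc_lock table.
class LockTable
  def initialize
    @rows = []
  end

  # The equivalent of joining doc_lock in with a "not yet expired" condition.
  def active_lock(document_id, now)
    @rows.find { |r| r.document_id == document_id && r.lock_expires > now }
  end

  # Returns the new lock, or nil when someone else holds a live one.
  def acquire(document_id, user_id, ttl_seconds, now)
    held = active_lock(document_id, now)
    return nil if held && held.user_id != user_id
    release(document_id, user_id) if held # renewing: drop the old row first
    lock = DocLock.new(document_id, user_id, now + ttl_seconds)
    @rows << lock
    lock
  end

  # Delete this user's lock rows for the document.
  def release(document_id, user_id)
    @rows.reject! { |r| r.document_id == document_id && r.user_id == user_id }
  end

  # Housekeeping for the expired rows that otherwise pile up;
  # returns how many rows were removed.
  def purge_expired(now)
    before = @rows.size
    @rows.reject! { |r| r.lock_expires <= now }
    before - @rows.size
  end

  def size
    @rows.size
  end
end

table = LockTable.new
t0 = Time.at(0)
table.acquire(42, 1, 60, t0)                     # user 1 locks document 42 for a minute
puts table.acquire(42, 2, 60, t0 + 10).inspect   # user 2 is refused while the lock is live
puts table.acquire(42, 2, 60, t0 + 61).user_id   # after expiry, user 2 gets the lock
puts table.purge_expired(t0 + 61)                # user 1's stale row is cleaned up
```

<p>Passing <code>now</code> in explicitly, rather than reading the clock inside each method, is just a choice that makes the expiry behavior easy to exercise.</p>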
<p>Your app also needs to define the behavior that surrounds this: what the security model for locks is (who can know about them, whether they work on a user/group/role basis, etc&#8230;), and when a lock can be broken.  Sooner or later you&#8217;ll need to break locks, for instance when an employee on vacation has left documents locked.  But that&#8217;s all above the database structure and the immediate operations on the lock table, which are what I&#8217;m discussing here.</p>
<p>Well, that&#8217;s part one of three.  The next segment will be the Ruby-on-Rails implementation I sketched out for my interlocutor, and the last will be some variations on and exceptions to this idea.  I consider long-lived locks a design pattern, because it&#8217;s a recurring pattern in enterprise computing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2007/05/07/long-live-locks-part-1/feed/</wfw:commentRss>
		</item>
		<item>
		<title>TCSH and Ruby on Rails for Macintosh</title>
		<link>http://www.corprew.org/blog/2007/04/04/tcsh-and-ruby-on-rails-for-macintosh/</link>
		<comments>http://www.corprew.org/blog/2007/04/04/tcsh-and-ruby-on-rails-for-macintosh/#comments</comments>
		<pubDate>Wed, 04 Apr 2007 23:50:51 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[computer programming]]></category>

		<guid isPermaLink="false">http://www.corprew.org/2007/04/04/tcsh-and-ruby-on-rails-for-macintosh/</guid>
		<description><![CDATA[Some comments on Hivelogic - The Narrative - Building Ruby, Rails, Subversion, Mongrel, and MySQL on Mac OS X
I&#8217;ve been using this set of instructions to install ruby on rails on MacOSX for a while (in case you&#8217;ve ever wondered, which you haven&#8217;t, I use a MacBook Pro set up to run Windows XP and [...]]]></description>
			<content:encoded><![CDATA[<p>Some comments on <a href="http://hivelogic.com/narrative/articles/ruby-rails-mongrel-mysql-osx">Hivelogic - The Narrative - Building Ruby, Rails, Subversion, Mongrel, and MySQL on Mac OS X</a></p>
<p>I&#8217;ve been using this set of instructions to install Ruby on Rails on Mac OS X for a while (in case you&#8217;ve ever wondered, which you haven&#8217;t, I use a MacBook Pro set up to run Windows XP and Mac OS X 10.4.x).  It doesn&#8217;t work well for me as written, because I use &#8216;tcsh&#8217; and not &#8216;bash&#8217; as my shell on the computer.  I also like confining changes to my own account.</p>
<p>So, I use the instructions given in the cited article, with the following differences.</p>
<p><b>Paths</b><br />
Here, add the following line to the end of your .cshrc</p>
<blockquote><p><code>setenv PATH /usr/local/bin:/usr/local/sbin:/usr/local/mysql/bin:/sw/bin:$PATH</code></p></blockquote>
<p>(This is all just one long line)</p>
<p>For the rest, I replace all instances of &#8216;sudo command&#8217; with &#8216;sudo tcsh&#8217; followed by the command.  More concretely, instead of:</p>
<blockquote><p><code>curl -O ftp://ftp.ruby-lang.org/pub/ruby/1.8/ruby-1.8.6.tar.gz<br />
tar xzvf ruby-1.8.6.tar.gz<br />
cd ruby-1.8.6<br />
./configure --prefix=/usr/local --enable-pthread --with-readline-dir=/usr/local<br />
make<br />
sudo make install<br />
sudo make install-doc<br />
cd ..</code></p></blockquote>
<p>I do:</p>
<blockquote><p><code>curl -O ftp://ftp.ruby-lang.org/pub/ruby/1.8/ruby-1.8.6.tar.gz<br />
tar xzvf ruby-1.8.6.tar.gz<br />
cd ruby-1.8.6<br />
./configure --prefix=/usr/local --enable-pthread --with-readline-dir=/usr/local<br />
make<br />
sudo tcsh<br />
make install<br />
make install-doc<br />
exit<br />
cd ..</code></p></blockquote>
<p>This has the advantage of keeping my root environment clean and leaving root&#8217;s own shell as bash; the other solutions I&#8217;ve seen for this sort of thing fail on one or both counts.  There&#8217;s a related issue of whether you should be able to sudo a shell at all, but arguing about that isn&#8217;t the point of this article.  This article is about making sure you have the right environment variables when you type &#8216;make install,&#8217; basically.</p>
<p>I haven&#8217;t provided exact conversions of all the sets of commands because if you can&#8217;t figure the rest out, you might want to switch your account shell back to bash to avoid more trouble later.  In particular, you will want to execute the &#8216;rehash&#8217; shell command on occasion.</p>
<blockquote><p><code>[beansidhe:~/ruby-1.8.6] zeitgeis% ruby -v<br />
ruby 1.8.2 (2004-12-25) [universal-darwin8.0]<br />
[beansidhe:~/ruby-1.8.6] zeitgeis% rehash<br />
[beansidhe:~/ruby-1.8.6] zeitgeis% ruby -v<br />
ruby 1.8.6 (2007-03-13 patchlevel 0) [i686-darwin8.9.1]</code></p></blockquote>
<p>&#8216;rehash&#8217; causes the shell to rebuild its cached table of the executables found in your path, which is handy when you&#8217;ve just installed new executables into one of the path directories.</p>
<p>Technorati Tags: <a class="performancingtags" href="http://technorati.com/tag/rubyonrails" rel="tag">rubyonrails</a>, <a class="performancingtags" href="http://technorati.com/tag/ruby" rel="tag">ruby</a>, <a class="performancingtags" href="http://technorati.com/tag/macosx" rel="tag">macosx</a>, <a class="performancingtags" href="http://technorati.com/tag/development" rel="tag">development</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2007/04/04/tcsh-and-ruby-on-rails-for-macintosh/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Tagging / Taxonomy features in Sharepoint (MSFT)</title>
		<link>http://www.corprew.org/blog/2007/01/22/tagging-taxonomy-features-in-sharepoint-msft/</link>
		<comments>http://www.corprew.org/blog/2007/01/22/tagging-taxonomy-features-in-sharepoint-msft/#comments</comments>
		<pubDate>Tue, 23 Jan 2007 00:47:43 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[classification]]></category>

		<category><![CDATA[computer programming]]></category>

		<category><![CDATA[infosci]]></category>

		<guid isPermaLink="false">http://www.corprew.org/2007/01/22/tagging-taxonomy-features-in-sharepoint-msft/</guid>
		<description><![CDATA[Enterprise Content Management (ECM) Team Blog : Taxonomy/Tagging Starter Kit for SharePoint Server, also at the Sharepoint blog
Microsoft has made a kit available for Sharepoint that makes it easier to have taxonomy and tagging.&#160; The tagging allows authors to tag items and to also have controlled vocabularies on particular multi-valued properties.&#160; Users can incorporate the [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.msdn.com/ecm/archive/2007/01/22/taxonomy-tagging-starter-kit-for-sharepoint-server.aspx">Enterprise Content Management (ECM) Team Blog : Taxonomy/Tagging Starter Kit for SharePoint Server</a>, also at the <a href="http://blogs.msdn.com/sharepoint/archive/2007/01/22/taxonomy-tagging-starter-kit-for-sharepoint-server.aspx">Sharepoint blog</a></p>
<p>Microsoft has made a starter kit available for SharePoint that makes it easier to set up taxonomy and tagging.&nbsp; The tagging allows authors to tag items and also to have controlled vocabularies on particular multi-valued properties.&nbsp; Users can incorporate the controlled vocabularies into searches and also search by tags.</p>
<p>In the default configuration, users cannot tag items on the fly (although I suspect that they could change taxonomy values if they have permissions.)</p>
<p>I used to work in engineering at an <a href="http://www.filenet.com">ECM</a> company, so using the phrase &#8216;controlled vocabulary&#8217; in place of taxonomy for this is somewhat second nature.&nbsp; Since I took a lot of classification classes at the Information School, it&#8217;s interesting to see how companies implement these concepts.&nbsp; It could be interesting if these features became widely available in SharePoint.</p>
<p>Technorati Tags: <a class="performancingtags" href="http://technorati.com/tag/msft" rel="tag">msft</a>, <a class="performancingtags" href="http://technorati.com/tag/sharepoint" rel="tag">sharepoint</a>, <a class="performancingtags" href="http://technorati.com/tag/controlled%20vocabulary" rel="tag">controlled vocabulary</a>, <a class="performancingtags" href="http://technorati.com/tag/taxonomy" rel="tag">taxonomy</a>, <a class="performancingtags" href="http://technorati.com/tag/tagging" rel="tag">tagging</a>, <a class="performancingtags" href="http://technorati.com/tag/ecm" rel="tag">ecm</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2007/01/22/tagging-taxonomy-features-in-sharepoint-msft/feed/</wfw:commentRss>
		</item>
		<item>
		<title>whydeargodwhy?</title>
		<link>http://www.corprew.org/blog/2006/11/17/why-deargod-why/</link>
		<comments>http://www.corprew.org/blog/2006/11/17/why-deargod-why/#comments</comments>
		<pubDate>Fri, 17 Nov 2006 23:05:46 +0000</pubDate>
		<dc:creator>corprew</dc:creator>
		
		<category><![CDATA[computer programming]]></category>

		<guid isPermaLink="false">http://www.corprew.org/2006/11/17/why-deargod-why/</guid>
		<description><![CDATA[Why is releasing code with TODOs and FIXMEs in it &#8216;The Ruby Way&#8217;?
Technorati Tags: ruby
]]></description>
			<content:encoded><![CDATA[<p>Why is releasing code with TODOs and FIXMEs in it &#8216;The Ruby Way&#8217;?</p>
<p>Technorati Tags: <a href="http://www.technorati.com/tag/ruby" rel="tag">ruby</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.corprew.org/blog/2006/11/17/why-deargod-why/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>

