<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Matt_ptr * &#187; Thoughts</title>
	<atom:link href="http://mattptr.net/category/thoughts/feed/" rel="self" type="application/rss+xml" />
	<link>http://mattptr.net</link>
	<description>Programming and stuff -- incoherent and unfocused since 1997</description>
	<lastBuildDate>Mon, 27 Feb 2012 00:02:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.4</generator>
		<item>
		<title>Another 6 months of collected thoughts</title>
		<link>http://mattptr.net/2011/12/22/another-6-months-of-collected-thoughts/</link>
		<comments>http://mattptr.net/2011/12/22/another-6-months-of-collected-thoughts/#comments</comments>
		<pubDate>Thu, 22 Dec 2011 20:26:03 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=248</guid>
		<description><![CDATA[Wow. Here I was thinking I&#8217;d have time to update this site more, but then I suddenly became busy at work somehow. Anyway, here we go with the thoughts: Anniversary This site turned 14 on Dec 4th. I swear I didn&#8217;t forget this year. I was going to post on the 5th, but I couldn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>Wow. Here I was thinking I&#8217;d have time to update this site more, but then I suddenly became busy at work somehow. Anyway, here we go with the thoughts:</p>
<h3>Anniversary</h3>
<p>This site turned 14 on Dec 4th. I swear I didn&#8217;t forget this year. I was going to post on the 5th, but I couldn&#8217;t think of anything besides, &#8220;The site is 14. Yay.&#8221; &#8230; and I still can&#8217;t think of anything better. So&#8230; yay.</p>
<h3>Drupal</h3>
<p>I no longer have confidence in Drupal. Not that I ever had much, but after spending 6 months working with 5 or 6 Drupal 7 sites, I don&#8217;t know why anyone uses it or continues to develop it. There are a few reasons to list, but the long and short of it is the database.</p>
<p>Drupal&#8217;s &#8220;Field&#8221; feature is important and necessary for nearly every project, and I use it quite a bit. On one project, I had at least 30 unique fields for different content types. The problem is that for each field, two database tables are created. Yes, not just one table, but two.</p>
<p>So my project that has 30 unique fields now has 60 extra tables. Sixty. This is in addition to the numerous tables that are created by default and by the different modules. The result is unbelievably long queries with so many joins that they eat memory and run slowly. I don&#8217;t know where that decision came from, but it feels like a bolted-on solution. Drupal 7 breaks compatibility with previous versions in almost every regard, so why not just rewrite the storage paradigm?</p>
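<p>To put a rough number on that, here is a minimal Python sketch of the arithmetic. The field names are made up, and the <code>field_data_*</code> / <code>field_revision_*</code> table names are what I understand Drupal 7&#8217;s default SQL field storage to use. The generated query only illustrates the join count; it is not the SQL Drupal actually emits.</p>

```python
# Illustration only: field names are hypothetical, and the SQL is a
# simplified stand-in for what a field-heavy node load has to join.
FIELD_COUNT = 30
TABLES_PER_FIELD = 2  # one "data" table plus one "revision" table per field


def field_tables(field_name):
    """Return the two table names assumed for a single field."""
    return ["field_data_" + field_name, "field_revision_" + field_name]


def build_node_query(field_names):
    """Build one SELECT that LEFT JOINs the data table of every field."""
    lines = ["SELECT n.nid", "FROM node n"]
    for index, name in enumerate(field_names):
        alias = "f%d" % index
        lines.append(
            "LEFT JOIN field_data_%s %s ON %s.entity_id = n.nid"
            % (name, alias, alias)
        )
    return "\n".join(lines)


fields = ["field_example_%02d" % i for i in range(FIELD_COUNT)]
extra_tables = [table for name in fields for table in field_tables(name)]
query = build_node_query(fields)

print(len(extra_tables))         # extra tables created for the fields
print(query.count("LEFT JOIN"))  # joins needed to load every field at once
```

<p>Even this simplified version needs one join per field just to read the current values, before any revision tables or module tables get involved.</p>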
<h3>So which CMS then?</h3>
<p>I don&#8217;t have an answer. I started toying around with several Django based CMSes, Plone, and other Python solutions. All of them have their drawbacks, but mostly, they are far too elaborate and massive for the type of projects I do.</p>
<p>So I&#8217;m going to roll my own. I&#8217;m aware of the, &#8220;why reinvent the wheel,&#8221; argument, but if you&#8217;re never fully satisfied with the wheels available, why settle?</p>
<p>Currently, I&#8217;m thinking about a Python system with ZODB as a storage engine. But&#8230;</p>
<h3>Python library developers make me nervous</h3>
<p>The way I see it, there are 3 levels of developers for most languages:</p>
<ol>
<li>The core developer &#8212; works on the language and/or standard library</li>
<li>The library/extension developer &#8212; creates projects to ease pains caused by the stdlib, to speed development along and to add specific functionality not provided by the stdlib. E.g., Django, Pyramid, SQLAlchemy, Zope, etc.</li>
<li>The application developer &#8212; works on projects that integrate various libraries to create something for end-users. E.g., Me.</li>
</ol>
<p>I&#8217;ve been working with Python since 2005 or 2006 &#8212; not a very long time. Yet I&#8217;ve picked so many libraries that have been superseded, barely updated, or flat out abandoned. To name a few, Spyced, Aquarium, Webware/WebKit, Pylons, Cheetah Templates, ZSI, PyXML, Glashammer, Google AppEngine, mod_python and who knows how many others.</p>
<p>Over the past 6 months, the Flask and Jinja projects have really grown on me. So much so that I started working on the aforementioned CMS with them. But over the same period of time, <a href="http://lucumr.pocoo.org">Armin Ronacher</a> &#8211; the creator of Flask, Werkzeug and Jinja &#8211; has written some things on Python 3 and WSGI that make me worry that I once again bet on the wrong horse.</p>
<p>This isn&#8217;t meant to be a dig at him at all; I love the work that he&#8217;s done. I just can&#8217;t believe my luck. And it&#8217;s not just a Python thing. JavaScript, C, Assembly, PHP&#8230; I&#8217;ve been picking libraries this way since the days when I was starting a project with WinG just before DirectX started to break.</p>
<p>This is why I do not gamble or own stocks.</p>
<h3>Ding Dong, the wicked jQuery plugin site is dead</h3>
<p>jQuery is getting a new plugin site due to an &#8220;accident&#8221; on the existing site. I use &#8220;accident&#8221; loosely, because it had been limping along, gasping for air, on life support, and mostly brain-dead since its inception. (Do note that the old plugin site ran Drupal.)</p>
<p>The new plugin site will be based on GitHub or something.</p>
<p>The point is I plan to have both of my plugins up there at some point.</p>
<h3>And that&#8217;s about it</h3>
<p>I&#8217;ll be back before 6 months&#8230; maybe :)</p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2011/12/22/another-6-months-of-collected-thoughts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Another batch of uncollected thoughts</title>
		<link>http://mattptr.net/2011/06/28/another-batch-of-uncollected-thoughts/</link>
		<comments>http://mattptr.net/2011/06/28/another-batch-of-uncollected-thoughts/#comments</comments>
		<pubDate>Tue, 28 Jun 2011 16:45:27 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Web Apps]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=225</guid>
		<description><![CDATA[I haven&#8217;t really had any down time at work lately, and when I get home, I barely have any energy to think about programming. So this site has kind of fallen to the wayside. But, I do have some work related things I can blather on about. Django is still a pleasure to work with [...]]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t really had any down time at work lately, and when I get home, I barely have any energy to think about programming. So this site has kind of fallen to the wayside. But, I do have some work related things I can blather on about.</p>
<h3>Django is still a pleasure to work with</h3>
<p>I developed a site a little over a year ago using Django (1.2) and just completed some updates. Even after updating this code multiple times, it hasn&#8217;t become messy. For content oriented sites, Django rocks. Plain and simple. For things more complex, core Django would probably stand up. For the admin tool, I&#8217;m not so sure. I couldn&#8217;t see it being good in every situation &#8212; if you have to customize it too much, you might be better off writing your own admin interface.</p>
<p>I&#8217;ve been managing the database schema with <a href="http://south.aeracode.org/">South</a> &#8212; which is probably one of the best programs I&#8217;ve ever used. Everything just works. I haven&#8217;t run into an issue where an automatic update didn&#8217;t work, but then again, this site isn&#8217;t the most complicated site.</p>
<h3>Lots of people don&#8217;t understand Pyramid</h3>
<p>I have my own personal gripes with Pyramid. But over the last few months, I&#8217;ve seen a lot of posts looking for help understanding the &#8220;new&#8221; aspects of Pyramid: namely, the repoze/traversal/zope portions of it. The typical response is: <a href="http://www.reddit.com/r/Python/comments/i3fo6/as_an_experienced_pylons_dev_pyramid_is_melting/c20kw3h">ignore the parts you don&#8217;t understand</a>. So the Pyramid developers spent all this effort, all this time, to improve their framework and to write documentation about how it&#8217;s better, how it&#8217;s different, and how the <a href="http://docs.pylonsproject.org/projects/pyramid/1.0/designdefense.html">complaints users have about it are unfounded</a>, only to tell people to ignore key parts of it and even go so far as to <a href="http://docs.pylonsproject.org/projects/akhet/dev/">write a wrapper</a> to make it more like the old framework.</p>
<p>Forget understanding the framework&#8230; I just don&#8217;t understand its developers.</p>
<h3>Do framework developers not understand framework users?</h3>
<p>I found <a href="http://reinout.vanrees.org/weblog/2011/06/08/whither-django.html#trends-to-watch">this article</a> a few weeks ago &#8212; a rundown of a presentation by Russell Keith-Magee at DjangoCon. My particular interest was in the following quote:</p>
<blockquote><p>Microframeworks. How on earth can an april fools joke like <a href="http://flask.pocoo.org/">Flask</a> get actual traction? Turn into a popular framework? Django is lots and lots smaller than zope, but these new ones are even smaller. What is small? What is micro? Could we adopt some? Can we become more attractive? We should think about this.</p></blockquote>
<p>It&#8217;s simple. Flask is better documented than 95.2187% of all the web frameworks available for Python. On the other hand, Django is documented better than 99% of all of them. When it comes to web programming, there&#8217;s no such thing as a silver bullet. Django is &gt; 6MB. Some people think that&#8217;s too big. Some people don&#8217;t like Django&#8217;s ORM and/or template system, and rather than spend effort changing it, would prefer to start with something that doesn&#8217;t force these things upon you.</p>
<p>Instead of trying to figure out how it gained traction and how to apply &#8220;the marketing&#8221; of a project to your own, maybe try to figure out why users like it.</p>
<h3>In conclusion, WTF?</h3>
<p>Apparently, there is <a href="http://drupal.org/node/534594">a pretty serious bug</a> with Drupal 7.2. Also, <a href="http://drupal.org/node/1170312#comment-4547662">a fix has been found and applied in SVN/CVS/Whatever</a>. But I guess they are waiting for 7.3 to roll out this fix?</p>
<p><img class="aligncenter" title="Sigh" src="http://i.imgur.com/whcRs.jpg" alt="" width="400" height="300" /></p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2011/06/28/another-batch-of-uncollected-thoughts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts collected over the last two and a half months</title>
		<link>http://mattptr.net/2011/01/19/thoughts-collected-over-the-last-two-and-a-half-months/</link>
		<comments>http://mattptr.net/2011/01/19/thoughts-collected-over-the-last-two-and-a-half-months/#comments</comments>
		<pubDate>Wed, 19 Jan 2011 20:45:23 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=209</guid>
		<description><![CDATA[Seeing as how I&#8217;m pretty unable to focus on anything these days, I have run up a collection of thoughts that have made their way through my damned brain. Anniversary This site turned 13 on 12/4/2010. Yes, I&#8217;m now in my terrible teens. Next thing you know, I&#8217;ll hate everything and complain about it. Oh [...]]]></description>
			<content:encoded><![CDATA[<p>Seeing as how I&#8217;m pretty unable to focus on anything these days, I have run up a collection of thoughts that have made their way through my damned brain.</p>
<p><strong>Anniversary</strong><br />
This site turned 13 on 12/4/2010. Yes, I&#8217;m now in my terrible teens. Next thing you know, I&#8217;ll hate everything and complain about it. Oh wait.</p>
<p><strong>Servers</strong><br />
I like Ubuntu as a server more than I like Debian. Specifically because they aren&#8217;t stuck on Python 2.5 and PHP 5.2. Which brings me to my next thought&#8230;</p>
<p><strong>Python on Linux sucks</strong><br />
<a href="http://sheddingbikes.com/posts/1285063820.html">I&#8217;m not the first to point this out</a>. It&#8217;s true though. Specifically, I&#8217;m referring to CentOS, which ships with Python 2.4. When I first started programming in Python, oh 5 years ago, I used Python 2.4. Having experienced installing a &#8220;side-by-side&#8221; version, which has to be done from source, I would rather be stabbed in the eye than do it again.</p>
<p><strong>That kind of crap makes it hard to get mod_wsgi working</strong><br />
And as mod_wsgi takes over in popularity, fcgi deployment examples start to become scarce.</p>
<p><strong>I actually tried out Pyramid</strong><br />
After all of my brain hemorrhaging when it was announced, I decided it may be worth a shot. Boy was I wrong. And by &#8220;wrong&#8221; I mean &#8220;right&#8221; (that I would hate it). It seems to be too unclean for me. I read a giant tutorial and ended up with something similar to Django (except it runs on Paste and you can use ZCML to configure it). Or so it seemed. I wasn&#8217;t sure where to put model definitions, or where to put what used to be &#8220;controller&#8221; code. I&#8217;m going to stick with Flask or Bottle.</p>
<p><strong>nginx is cool. I guess?</strong><br />
I&#8217;ve been setting up tons of VMs lately. On one I installed nginx, php-fpm and xcache to test out Drupal 7. It runs with great speed. There&#8217;s no per-directory or per-user configuration for nginx, so it&#8217;s pretty much useless as an Apache replacement. Chances of me ever using it outside of a VM are very slim.</p>
<p><strong>Tried out Redis, MongoDB and CouchDB</strong><br />
Redis isn&#8217;t related to the other two, but all 3 are something I&#8217;ll never get to use in real life.</p>
<p>That&#8217;s about it, I think. I did make a pretty neat web app recently, but I&#8217;m unsure how I can host it. Just something else to think about over the next 3 months.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2011/01/19/thoughts-collected-over-the-last-two-and-a-half-months/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Oh for the love of Jeff</title>
		<link>http://mattptr.net/2010/11/10/oh-for-the-love-of-jeff/</link>
		<comments>http://mattptr.net/2010/11/10/oh-for-the-love-of-jeff/#comments</comments>
		<pubDate>Thu, 11 Nov 2010 02:20:06 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=206</guid>
		<description><![CDATA[So in a rare moment of downtime today, I decided to check out the Pylons mailing list since I&#8217;m working on a *huge* project in Pylons. Somehow I missed that Pylons is merging with repoze.bfg. What does it mean? According to Ben Bangert, it will provide, &#8220;a way to easily extend a Pylons project.&#8221; What [...]]]></description>
			<content:encoded><![CDATA[<p>So in a rare moment of downtime today, I decided to check out the Pylons mailing list since I&#8217;m working on a *huge* project in Pylons. Somehow I missed that Pylons is merging with repoze.bfg.</p>
<p>What does it mean? According to <a href="http://be.groovie.org/post/1347858988/why-extending-through-subclassing-a-frameworks">Ben Bangert</a>, it will provide, &#8220;a way to easily extend a Pylons project.&#8221; What does extending mean here? I have no idea. Apparently, in the few Pylons apps I wrote, I&#8217;ve never said, &#8220;Gee this is great, but I need Pylons to do shit that it doesn&#8217;t need to do.&#8221; I guess?</p>
<p>The only thing I can think of is similar to Django&#8217;s reusable apps. If I was that worried about it, I&#8217;d use Django.</p>
<p>Anyway, what does it mean to me? Another rewrite. I started using Pylons at 0.9.4. I wrote a project for it. Then 0.9.5 came out and I rewrote it for that. Then 0.9.6 came out and I rewrote parts again.</p>
<p>So now I&#8217;m in the middle of one of the biggest projects I&#8217;ve ever done &#8212; and I&#8217;ve staked my reputation on Pylons &#8212; and I&#8217;m finding out that I will need to rewrite it if I plan on using the next version of Pylons.</p>
<p>No. I&#8217;m not doing that. The Pylons 1.0 codebase isn&#8217;t going anywhere (it seems). But it won&#8217;t grow anymore either.</p>
<p>I will continue to write my application for Pylons. I don&#8217;t plan on using Pyramid. I don&#8217;t care how good the architecture is. I&#8217;m not worried that it uses parts of Zope. I don&#8217;t care how simple the hello world program is. I don&#8217;t care about extensibility. I don&#8217;t care that it&#8217;s &#8220;better&#8221; than micro-frameworks because they did something &#8220;wrong&#8221; or &#8220;evil&#8221; in their design.</p>
<p>I don&#8217;t hate Zope, I don&#8217;t have any bias against one web framework or another, and nothing in the <a href="http://docs.pylonshq.com/pyramid/dev/designdefense.html#">Defense Document</a> applies to me. I just&#8230; don&#8217;t&#8230; care.</p>
<p>I&#8217;m glad the Pylons devs will get around this wall they&#8217;ve run into. Well&#8230; until the next wall and the next rewrite, the next merge, the next whatever. I&#8217;m staying away.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2010/11/10/oh-for-the-love-of-jeff/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Quick thought on SQLAlchemy</title>
		<link>http://mattptr.net/2010/09/15/quick-thought-on-sqlalchemy/</link>
		<comments>http://mattptr.net/2010/09/15/quick-thought-on-sqlalchemy/#comments</comments>
		<pubDate>Wed, 15 Sep 2010 20:36:36 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Pylons]]></category>
		<category><![CDATA[Sessions]]></category>
		<category><![CDATA[SQLAlchemy]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=188</guid>
		<description><![CDATA[Just a quick thought on SQLAlchemy today. Everyone that reads this blog surely knows that I think it is the champion of database libraries despite the fact that time and again they have ditched API backward-compatibility between minor releases (Yes open-source world, 0.x.y is still a minor release). It infuriates me to no end, because [...]]]></description>
			<content:encoded><![CDATA[<p>Just a quick thought on <a href="http://www.sqlalchemy.org">SQLAlchemy</a> today.</p>
<p>Everyone that reads this blog surely knows that I think it is the champion of database libraries despite the fact that time and again they have ditched API backward-compatibility between minor releases (Yes open-source world, 0.x.y is still a minor release). It infuriates me to no end, because I am usually affected and I have to nearly rewrite my application&#8230; or blow it off and stay with an older version until the world itself stops turning.</p>
<p>Anyway, I&#8217;ve used SA for many types of applications, but using it in web apps always bothers me. Why? Use of the word <em>session.</em></p>
<p>Sessions, in web applications, almost universally refer to the user&#8217;s &#8220;session&#8221; &#8212; their persistent data, stored on the webserver. Sessions, in SA-land, are roughly the equivalent of a database connection, but they&#8217;re not exactly the same. An SA session handles every aspect of communicating with the database server, and in a very smart and efficient manner, I might add.</p>
<p>When developing web applications using SA, it can get confusing. Pylons can initiate a project with some code to start your database Session if you choose to do so. And if you plan on using SA in your Pylons application, why wouldn&#8217;t you do this? By default, the variable for the database session is <em>Session</em>. Note the capital letter. <em>session</em> is different &#8212; that&#8217;s where your persistent user-data goes.</p>
<p>It&#8217;s simple enough with Pylons to end the confusion and change the variable. I like using <em>dbs</em> (database session). But, now the Pylons documentation and user snippets will be different from your app &#8212; which kinda adds some of that confusion back. To make things worse, Pylons&#8217; docs usually refer to the 3rd party module docs &#8212; and in SA&#8217;s case, all of the examples in the documentation use the variable <em>session.</em></p>
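<p>Here is a toy sketch of the naming problem. The two classes are stand-ins I made up to play the roles of the SA session and the web session; they are not the real SQLAlchemy or Pylons APIs, and <code>login</code> is a hypothetical controller action. The only point is that a distinct name like <code>dbs</code> makes it obvious which &#8220;session&#8221; each line is touching.</p>

```python
# Stand-ins only: these classes imitate the *roles* of a SQLAlchemy session
# and a web framework's per-user session; they are not the real APIs.
class FakeDatabaseSession:
    """Plays the part of the ORM session: collects objects, then commits."""

    def __init__(self):
        self.pending = []
        self.committed = []

    def add(self, obj):
        self.pending.append(obj)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []


class FakeUserSession(dict):
    """Plays the part of the web session: per-user data kept between requests."""

    def save(self):
        self["_saved"] = True


# Distinct names make it clear which kind of session is in play.
dbs = FakeDatabaseSession()   # database session (named `Session` by default)
session = FakeUserSession()   # the user's web session


def login(username):
    """Hypothetical controller action that touches both kinds of session."""
    dbs.add({"event": "login", "user": username})
    dbs.commit()
    session["user"] = username
    session.save()


login("matt")
print(dbs.committed)
print(session)
```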
<p>Obviously this isn&#8217;t a major problem. But imagine it&#8217;s the first thing in the morning, you were up late and you haven&#8217;t gotten your caffeine fix. This could be catastrophic.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2010/09/15/quick-thought-on-sqlalchemy/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Haven&#8217;t given up</title>
		<link>http://mattptr.net/2010/09/07/havent-given-up/</link>
		<comments>http://mattptr.net/2010/09/07/havent-given-up/#comments</comments>
		<pubDate>Tue, 07 Sep 2010 18:35:37 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[real life]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=185</guid>
		<description><![CDATA[I haven&#8217;t given up on the projects that I&#8217;ve started recently. However, work has been pretty busy and looks to be getting busier. That means the jQuery Plugin Index, that I was in the midst of creating, has been stalled. I still *want* to do it. Very much so. However, I did notice that someone [...]]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t given up on the projects that I&#8217;ve started recently. However, work has been pretty busy and looks to be getting busier. That means the jQuery Plugin Index, that I was in the midst of creating, has been stalled. I still *want* to do it. Very much so. However, I did notice that someone is working on <a href="http://pypi.appspot.com/">PyPI for Google App Engine</a>. If it gets fully implemented, I think it would be easy enough to fork that and adapt it.</p>
<p>In the meantime, <a href="http://mattptr.net/2010/07/28/building-python-extensions-in-a-modern-windows-environment/">my post</a> on building Python extensions on Windows has gotten a lot of attention. I hope it helps people out, but believe it or not, I still have trouble building certain extensions. It&#8217;s especially painful if the extension depends on a library that doesn&#8217;t have native Windows support. But hopefully, this cuts down on the need for running and maintaining a VM just so you can code fun stuff with Python.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2010/09/07/havent-given-up/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Replacing the jQuery Plugin site</title>
		<link>http://mattptr.net/2010/07/09/replacing-the-jquery-plugin-site/</link>
		<comments>http://mattptr.net/2010/07/09/replacing-the-jquery-plugin-site/#comments</comments>
		<pubDate>Fri, 09 Jul 2010 17:26:11 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[jquery]]></category>
		<category><![CDATA[new project]]></category>
		<category><![CDATA[plugins]]></category>
		<category><![CDATA[redo the jquery plugin site]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=172</guid>
		<description><![CDATA[I&#8217;ve touched on this a while back, but I never followed through with it. Having some down time at work, I&#8217;ve decided to jump in. I want to replace http://plugins.jquery.com. There are numerous problems with it, and one of the reasons I got overwhelmed by this project originally is because I wanted to fix the [...]]]></description>
			<content:encoded><![CDATA[<p>I touched on this a while back, but I never followed through with it. Having some down time at work, I&#8217;ve decided to jump in. I want to replace <a href="http://plugins.jquery.com">http://plugins.jquery.com</a>.</p>
<p>There are numerous problems with it, and one of the reasons I got overwhelmed by this project originally is because I wanted to fix the site, rather than replace it. So now, I&#8217;ve wised up and decided to start from scratch without considering any aspect of how the site currently works.</p>
<p>Here are the problem areas, as I see them, in no particular order.</p>
<h3>Browsing through plugins is ridiculously terrible</h3>
<p>First, when you get to the page, you get a bunch of categories. Compare this with <a href="http://pypi.python.org">http://pypi.python.org</a>. PyPI gives a tabular listing of 40 recently updated packages. For the Latest Releases, PJC (plugins.jquery.com) gives a full body of content for each item and goes on for a zillion pages.</p>
<p>Second, from the start page on PJC (with the category listing), the &#8220;Browse by Name&#8221; tab doesn&#8217;t work. The &#8220;Browse by Date&#8221; tab does work, but what date? The date the plugin was created, or the date of the last release? It turns out this is the same as the &#8220;Latest Releases&#8221; page, except that the tab navigation at the top doesn&#8217;t disappear. The &#8220;All Plugins&#8221; link is the same as the &#8220;Browse by Name&#8221; tab and also doesn&#8217;t work.</p>
<p>Lastly, browsing plugins in a category gives a different layout from browsing by date. Why? It&#8217;s the same information, just sorted differently and filtered.</p>
<h3>Searching is basically useless</h3>
<p>Do you know why I&#8217;m surprised that people have actually used my timer plugin? Because I can&#8217;t even find it myself. Searching for &#8220;timer&#8221; yields 10 pages of results, and includes plain pages and issue tracker items.</p>
<p>I understand the appeal of having the bug tracker and plugin page tied together, but it&#8217;s terrible. A plugin like mine is so small that it doesn&#8217;t need a bug tracker. Not to mention that use of a bug tracker is annoying without the use of source control. The plugin author should bear the responsibility of setting up bug tracking, source control, etc. There are plenty of free sites to do that.</p>
<p>The search is easily bombed by adding keywords and tags (which are not moderated). So when I search for timer, the sixth result I get is for <a href="http://plugins.jquery.com/project/dualSlider">dualSlider</a> &#8212; perfect for managing timeouts and intervals.</p>
<h3>The Rating System</h3>
<p>There&#8217;s no point to this. The &#8220;Top Rated&#8221; plugins all have 1-3 votes. Plugins with more votes should have more clout. But it doesn&#8217;t really matter anyway. It&#8217;s not a popularity contest.</p>
<p>This particular part of PJC will have no part whatsoever in my new project. If there will be any spotlighting of plugins, it will be done by moderators.</p>
<h3>Other Data Formats</h3>
<p>Right now, there are no RSS feeds for plugins at all. Each plugin should have its own release feed, as well as a feed for all latest releases.</p>
<p>Writing a plugin manager currently would involve screen scraping the existing plugin page to see if there have been any changes. Of course, you have to know the URL of the plugin because searching basically gets nowhere, and if by some chance you were able to search, you&#8217;d have to scrape the search page as well.</p>
<p>That&#8217;s why I want to have everything available as JSON. Plugin details, list of plugins by category, search results&#8230; the new site has to be highly query-able. PyPI uses XML-RPC to expose their API. JSON-RPC might be an option for this, or XML-RPC, but I&#8217;ll cross that bridge when I come to it.</p>
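<p>As a rough idea of what &#8220;everything available as JSON&#8221; could look like, here is a minimal Python sketch. The plugin records, the field names and the three functions are all placeholders I invented for illustration; nothing here is an existing API, and the real thing would sit behind HTTP rather than plain function calls.</p>

```python
import json

# Hypothetical plugin records; the names, versions and dates are made up.
PLUGINS = [
    {"name": "example-timer", "version": "1.0",
     "categories": ["Utilities"], "released": "2010-01-15"},
    {"name": "example-slider", "version": "0.3",
     "categories": ["Widgets"], "released": "2010-03-02"},
]


def plugin_details(name):
    """Return one plugin record as a JSON string, or a JSON error object."""
    for plugin in PLUGINS:
        if plugin["name"] == name:
            return json.dumps(plugin, sort_keys=True)
    return json.dumps({"error": "not found", "name": name}, sort_keys=True)


def search(term):
    """Return the names of plugins whose name contains the term, as JSON."""
    term = term.lower()
    matches = [p["name"] for p in PLUGINS if term in p["name"].lower()]
    return json.dumps({"query": term, "results": matches}, sort_keys=True)


def latest_releases(limit=10):
    """Return plugin names, most recently released first, as JSON."""
    ordered = sorted(PLUGINS, key=lambda p: p["released"], reverse=True)
    return json.dumps([p["name"] for p in ordered[:limit]])


print(plugin_details("example-timer"))
print(search("timer"))
print(latest_releases())
```

<p>A plugin manager could then poll something like <code>latest_releases</code> instead of scraping HTML.</p>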
<h3>Categories</h3>
<p>The Categories on PJC are terrible. Not in the way that they aren&#8217;t descriptive, but they just suck. They should be hierarchical. For example, &#8220;Widgets&#8221; and &#8220;Windows and Overlays&#8221; could fall under &#8220;User Interface.&#8221; Menus could as well.</p>
<p>I&#8217;m not sure how Navigation and Menus are different.</p>
<p>DOM should probably be a child of Utilities.</p>
<p>I don&#8217;t know what AJAX means for a category. If the plugin is an AJAX request helper, it should go under &#8220;Utilities&#8221; or &#8220;jQuery Extension.&#8221; If it&#8217;s something like an auto-complete widget, well it should go under Widgets.</p>
<p>The point is that categories aren&#8217;t very helpful in their current state. I put my Timer plugin under jQuery Extensions, Javascript, and Utilities, leading me to believe that they could all be the same category. I don&#8217;t know why Javascript is a category actually, since jQuery encapsulates, rather than extends.</p>
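<p>For what it&#8217;s worth, a hierarchy like the one suggested above doesn&#8217;t need much. This is a minimal sketch that stores each category&#8217;s parent in a dict; the structure and the helper names are my own assumptions, and the tree only contains the examples mentioned in this post.</p>

```python
# Hypothetical hierarchy built from the examples above; the parent is None
# for top-level categories.
CATEGORY_PARENTS = {
    "User Interface": None,
    "Widgets": "User Interface",
    "Windows and Overlays": "User Interface",
    "Menus": "User Interface",
    "Utilities": None,
    "DOM": "Utilities",
}


def category_path(name):
    """Return the list of categories from the top level down to `name`."""
    path = []
    while name is not None:
        path.append(name)
        name = CATEGORY_PARENTS[name]
    return list(reversed(path))


def descendants(name):
    """Return every category that sits anywhere below `name`, sorted."""
    found = []
    for child, parent in CATEGORY_PARENTS.items():
        if parent == name:
            found.append(child)
            found.extend(descendants(child))
    return sorted(found)


print(" / ".join(category_path("DOM")))
print(descendants("User Interface"))
```

<p>Browsing &#8220;User Interface&#8221; would then list plugins from every category returned by <code>descendants</code>, not just the ones filed directly under it.</p>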
<h3>The New Site</h3>
<p>I&#8217;ve already started. <a href="http://code.google.com/p/jqpi">http://code.google.com/p/jqpi</a> (the app page will be http://jquerypi.appspot.com)</p>
<p>Basically, I want to create PyPI for jQuery plugins. I figured using Google App Engine would be nice. Also, knowing my penchant for dragging out projects, I&#8217;m coding it for HTML5, since it will probably be widely supported by the time I&#8217;m finished.</p>
<p>There are a few things that I don&#8217;t know how to do with GAE though. Hierarchical categories, searching, optimization, JSON-RPC or XML-RPC. I&#8217;ll figure it out eventually, but help is always appreciated. Create an issue, create a wiki page, send patches, join the project, anything. We shouldn&#8217;t have to suffer the damned plugins.jquery.com any more.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2010/07/09/replacing-the-jquery-plugin-site/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Site Crawler Chronicles &#8211; Part 3</title>
		<link>http://mattptr.net/2010/03/19/the-site-crawler-chronicles-part-3/</link>
		<comments>http://mattptr.net/2010/03/19/the-site-crawler-chronicles-part-3/#comments</comments>
		<pubDate>Fri, 19 Mar 2010 14:33:05 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=150</guid>
		<description><![CDATA[I managed to find a solution for the problem I had yesterday, though I don&#8217;t particularly know if it&#8217;s ideal. Originally I had thought that I would need to store the entire hierarchy of the site in a tree like structure. I figured I could just store a list of the links on a page [...]]]></description>
			<content:encoded><![CDATA[<p>I managed to find a solution for the problem I had yesterday, though I don&#8217;t particularly know if it&#8217;s ideal.</p>
<p>Originally I had thought that I would need to store the entire hierarchy of the site in a tree-like structure. Then I figured I could just store a list of the links on each page in a dict and output all of the errors when the crawl was finished. I don&#8217;t know why I was hung up on the idea that errors had to be reported as they were encountered.</p>
<p>I was worried that memory use would be a factor, but it seems to be ok.</p>
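<p>A minimal sketch of that idea, in Python 3 rather than whatever the script actually uses. The names <code>links_on_page</code> and <code>broken_link_report</code> are made up for illustration, and the sample URLs are placeholders:</p>

```python
def broken_link_report(links_on_page, failed):
    """Map each failed URL to the sorted list of pages that link to it."""
    report = {}
    for url in sorted(failed):
        report[url] = sorted(
            page for page, links in links_on_page.items() if url in links
        )
    return report


# Hypothetical crawl results: page URL -> links scraped from that page.
links_on_page = {
    "http://www.example.com/": ["/about", "/missing"],
    "http://www.example.com/about": ["/missing"],
}
report = broken_link_report(links_on_page, {"/missing"})
```

<p>The dict is only read once the crawl is over, so nothing has to be reported while pages are still being fetched.</p>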
<p>But there&#8217;s another issue:</p>
<pre>    #taken from lxml.html.__init__
    def make_links_absolute(self, base, root):
        """This function exists because urljoin behaves obnoxiously.
        For example, if I'm on the page:
            http://www.example.com/some/directory/index.html, or just:
            http://www.example.com/some/directory/

        And I join the relative URL: ../../abc.html
        I end up with: http://www.example.com/abc.html

        *But*
        If I'm on: http://www.example.com/some/directory  [no trailing slash]
        I end up with: http://www.example.com/../abc.html
        """</pre>
<p>My fix for it was stripping out one &#8220;../&#8221;. Yesterday I thought that it would be a good fix. Today, I can&#8217;t figure out why I thought it would fix all cases.</p>
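<p>One way to cover more cases than stripping a single &#8220;../&#8221; might be to join first and then clean up whatever dot segments are left in the path. This is only a sketch: it assumes Python 3 (<code>urllib.parse</code>), and <code>join_and_normalize</code> is a hypothetical helper, not something from lxml. Recent Python 3 releases appear to collapse these segments inside <code>urljoin</code> already, in which case the cleanup loop simply changes nothing.</p>

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def join_and_normalize(base, relative):
    """Join relative onto base, then drop any '..' or '.' segments left in the path."""
    parts = urlsplit(urljoin(base, relative))
    cleaned = []
    for segment in parts.path.split("/"):
        if segment == "..":
            # Step up one level, but never above the site root.
            if len(cleaned) > 1:
                cleaned.pop()
        elif segment != ".":
            cleaned.append(segment)
    return urlunsplit(parts._replace(path="/".join(cleaned)))
```

<p>With this, the bases from the docstring above, with or without the trailing slash, should both resolve ../../abc.html to http://www.example.com/abc.html.</p>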
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2010/03/19/the-site-crawler-chronicles-part-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Site Crawler Chronicles</title>
		<link>http://mattptr.net/2010/03/18/the-site-crawler-chronicles/</link>
		<comments>http://mattptr.net/2010/03/18/the-site-crawler-chronicles/#comments</comments>
		<pubDate>Thu, 18 Mar 2010 18:04:32 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=148</guid>
		<description><![CDATA[So I managed to stop v4 of my web crawler from opening up a billion connections in parallel. Turns out that gevent has a Pool object and that was exactly what I needed. Now my little script (137 lines, including a utility object and comments) will not be a sysadmin&#8217;s nightmare. However, I now have a [...]]]></description>
			<content:encoded><![CDATA[<p>So I managed to stop v4 of my web crawler from opening up a billion connections in parallel. Turns out that <a href="http://www.gevent.org">gevent</a> has a <a href="http://www.gevent.org/gevent.pool.html">Pool</a> object and that was exactly what I needed.</p>
<p>Now my little script (137 lines, including a utility object and comments) will not be a sysadmin&#8217;s nightmare.</p>
<p>However, I now have a new problem. I described how the older versions work in my <a href="http://mattptr.net/2010/03/17/new-old-ideas/">previous post</a>, but this version is quite a bit different. Instead of using a queue or stack data structure to figure out where to go next, this version has a greenlet scrape all links from a page, filter out the ones it&#8217;s already been to, then return the rest. The main thread accumulates the lists once all greenlets are finished. After the accumulation (and after ensuring that there are no duplicate links), the main thread spawns a greenlet <em>for each link</em> and waits until the greenlets finish again. When the greenlets return no links, the main thread is done and the script terminates.</p>
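<p>Roughly, the loop looks like the sketch below. To keep it runnable with only the standard library, a <code>ThreadPoolExecutor</code> stands in for the gevent Pool, and the filtering happens in the main thread instead of inside each worker, so this is not the actual script. <code>crawl_by_rounds</code> and <code>get_links</code> are hypothetical names; <code>get_links(url)</code> is assumed to return the links found on that page.</p>

```python
from concurrent.futures import ThreadPoolExecutor


def crawl_by_rounds(start, get_links, pool_size=10):
    """Crawl in rounds: fetch every link of the current round, then build the next."""
    seen = {start}
    current = [start]
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        while current:
            # One task per link; the pool caps how many run at the same time.
            results = pool.map(get_links, current)
            next_round = set()
            for links in results:
                next_round.update(links)
            next_round -= seen       # drop anything already visited
            seen |= next_round
            current = sorted(next_round)
    return seen
```

<p>The loop ends when a round comes back with no new links, which matches the termination condition described above.</p>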
<p>The problem is, if there&#8217;s a 404 or some kind of error retrieving the page, I have no way of knowing what page that link was found on.</p>
<p>The only solution that I see is to use a custom data structure and hope that it doesn&#8217;t kill performance.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2010/03/18/the-site-crawler-chronicles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Old Ideas</title>
		<link>http://mattptr.net/2010/03/17/new-old-ideas/</link>
		<comments>http://mattptr.net/2010/03/17/new-old-ideas/#comments</comments>
		<pubDate>Wed, 17 Mar 2010 21:05:37 +0000</pubDate>
		<dc:creator>matt</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://mattptr.net/?p=145</guid>
		<description><![CDATA[A little while ago, I wrote a post in which I was trying to figure out a way to improve a Web Crawler script I had written &#8212; it was one of those that I never published. Anyway, for some reason, I wrote it using Stackless Python, but I was pretty novice and didn&#8217;t make [...]]]></description>
			<content:encoded><![CDATA[<p>A little while ago, I wrote a post in which I was trying to figure out a way to improve a Web Crawler script I had written &#8212; it was one of those that I never published.</p>
<p>Anyway, for some reason, I wrote it using <a href="http://www.stackless.com">Stackless Python</a>, but I was pretty much a novice and didn&#8217;t make it as efficient as I could have. This was version 1, and it basically went to each page, scraped all the valid links (i.e., those that were on the same server and not a mailto: or something) and then went through them recursively.</p>
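<p>The &#8220;valid links&#8221; check could look something like this. It is a sketch in Python 3, not the original Stackless code, and <code>same_site_links</code> is a name invented for the example:</p>

```python
from urllib.parse import urljoin, urlsplit


def same_site_links(page_url, hrefs):
    """Return absolute http(s) links from hrefs that stay on page_url's host."""
    host = urlsplit(page_url).netloc
    kept = []
    for href in hrefs:
        absolute = urljoin(page_url, href)
        parts = urlsplit(absolute)
        # Skips mailto:, javascript:, and anything on another server.
        if parts.scheme in ("http", "https") and parts.netloc == host:
            kept.append(absolute)
    return kept
```

<p>Making the links absolute before comparing hosts means relative links like b.html are kept, while a mailto: address has no host to match and is dropped.</p>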
<p>Version 2 was basically the same, just with cleaner code and no recursion. I decided to set up the library so I would extend the SiteCrawler class and get notified of what was going on through callbacks. While it wasn&#8217;t any faster than version 1, it did seem a bit more stable.</p>
<p>Version 3 I decided to change drastically and made it multithreaded. It is much, much, much faster. It works like this: there&#8217;s an input Queue and an already-checked Queue. Four threads wait for input on the input Queue. When a thread gets a URL, it scrapes the links, checks whether any of them are in the checked Queue, puts whatever is left on the input Queue, and puts the URL it just checked on the checked Queue. It sounds more complicated than it is. Also, the code is more complicated than it needs to be.</p>
<p>Anyway, version 3 works well for me when I need to test a site. It&#8217;s saved me so much time in going through and checking for broken URLs. There are a few clients whose sites have pages numbering in the 300s.</p>
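<p>Here is a stripped-down sketch of that design in Python 3. It is not the real version 3: a set guarded by a lock stands in for the checked Queue, URLs are marked as checked when they are queued rather than after they are fetched, and <code>crawl_threaded</code> and <code>get_links</code> are hypothetical names.</p>

```python
import queue
import threading


def crawl_threaded(start, get_links, workers=4):
    """Crawl from start using worker threads; get_links(url) returns a list of URLs."""
    todo = queue.Queue()
    checked = {start}            # a set stands in for the post's "checked Queue"
    lock = threading.Lock()

    def worker():
        while True:
            url = todo.get()
            try:
                for link in get_links(url):
                    with lock:
                        is_new = link not in checked
                        checked.add(link)
                    if is_new:
                        todo.put(link)
            finally:
                todo.task_done()

    for _ in range(workers):
        threading.Thread(target=worker, daemon=True).start()
    todo.put(start)
    todo.join()                  # returns once every queued URL was processed
    return checked
```

<p>Because a worker queues its new links before it marks its own URL as done, <code>todo.join()</code> cannot return while there is still work outstanding.</p>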
<p>But, I recently found out about <a href="http://www.gevent.org">gevent</a>, and since I have some free time at work, I wanted to play with it a little bit. If you don&#8217;t know, gevent is a package that works with the <a href="http://pypi.python.org/pypi/greenlet">greenlet</a> package on top of libevent. I&#8217;m always interested in concurrent programming, and new technologies involved in it. This is why I had installed Stackless at one time.</p>
<p>So now there&#8217;s a version 4 of the SiteCrawler script, using &#8212; you guessed it &#8212; gevent. I haven&#8217;t ironed out all of the kinks yet. I was testing a non-pooled version of the script and it basically crawled through 200 links in a matter of seconds &#8212; hopefully none of the server admins look at the logs and see 100 simultaneous connections at 3:30 PM today. I did change how the crawl is done quite a bit too. So I&#8217;m going to stop there and probably have more tomorrow or in the next few days.</p>
<p>Good stuff!</p>
]]></content:encoded>
			<wfw:commentRss>http://mattptr.net/2010/03/17/new-old-ideas/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

