<?xml version='1.0' encoding='utf-8' ?>
<!--  If you are running a bot please visit this policy page outlining rules you must respect. http://www.livejournal.com/bots/  -->
<rss version='2.0' xmlns:lj='http://www.livejournal.org/rss/lj/1.0/' xmlns:media='http://search.yahoo.com/mrss/' xmlns:atom10='http://www.w3.org/2005/Atom'>
<channel>
  <title>LotSo&apos;s OSS World</title>
  <link>http://lotso.livejournal.com/</link>
  <description>LotSo&apos;s OSS World - LiveJournal.com</description>
  <lastBuildDate>Sun, 05 Apr 2009 12:22:01 GMT</lastBuildDate>
  <generator>LiveJournal / LiveJournal.com</generator>
  <lj:journal>lotso</lj:journal>
  <lj:journalid>4034558</lj:journalid>
  <lj:journaltype>personal</lj:journaltype>
  <atom10:link rel='hub' href='http://pubsubhubbub.appspot.com/' />
  <image>
    <url>http://l-userpic.livejournal.com/21859311/4034558</url>
    <title>LotSo&apos;s OSS World</title>
    <link>http://lotso.livejournal.com/</link>
    <width>100</width>
    <height>75</height>
  </image>

<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/107348.html</guid>
  <pubDate>Sun, 05 Apr 2009 12:22:01 GMT</pubDate>
  <title>PostgreSQL 8.4 -&gt; Where are the On-Disk Bitmap Indexes?</title>
  <link>http://lotso.livejournal.com/107348.html</link>
  <description>PostgreSQL 8.4 is nearly out, and there are quite a few things in it that look interesting to me. However, the one thing I&apos;m still missing, and can&apos;t find the status of, is the on-disk bitmap indexes that were supposed to arrive in the 8.4 release.&lt;br /&gt;&lt;br /&gt;Would anyone from the PostgreSQL team be privy to that info? I can&apos;t seem to find it on Google.&lt;br /&gt;&lt;br /&gt;Thanks.</description>
  <comments>http://lotso.livejournal.com/107348.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:mood>aggravated</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>9</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/107221.html</guid>
  <pubDate>Sun, 09 Nov 2008 15:16:34 GMT</pubDate>
  <title>FC10 VPN Setup to Windows PPTP using Network Manager</title>
  <link>http://lotso.livejournal.com/107221.html</link>
  <description>I spent a couple of hours on this while trying to configure FC10 (moved from Gentoo [for now]) and NetworkManager to connect to my office&apos;s VPN server, which runs on Windows(R) ISA Server.&lt;br /&gt;&lt;br /&gt;I tried a variety of methods and configurations, but I kept getting errors like these:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;Dec 26 15:02:00 localhost pppd[5483]: Plugin /usr/lib/pppd/2.4.4/nm-pptp-pppd-plugin.so loaded.&lt;br /&gt;Dec 26 15:02:00 localhost pppd[5483]: pppd 2.4.4 started by root, uid 0&lt;br /&gt;Dec 26 15:02:00 localhost pptp[5484]: nm-pptp-service-5480 log[main:pptp.c:314]: The synchronous pptp option is NOT activated&lt;br /&gt;Dec 26 15:02:03 localhost pptp[5493]: nm-pptp-service-5480 warn[ctrlp_disp:pptp_ctrl.c:956]: Non-zero Async Control Character Maps are not supported!&lt;br /&gt;Dec 26 15:02:09 localhost pppd[5483]: MS-CHAP authentication failed: E=691 Authentication failure&lt;br /&gt;Dec 26 15:02:09 localhost pppd[5483]: CHAP authentication failed&lt;br /&gt;Dec 26 15:02:10 localhost pppd[5483]: Connection terminated.&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Authentication errors, but the question was WHY?&lt;br /&gt;&lt;br /&gt;Here is what it turned out to be. This may not apply to you, but it definitely applied to me.
&lt;br /&gt;&lt;br /&gt;My office uses Windows ISA Server (a 200x version, I would think).&lt;br /&gt;&lt;br /&gt;In NetworkManager, there are 3 input boxes:&lt;br /&gt;&lt;br /&gt;Username&lt;br /&gt;Password&lt;br /&gt;Domain&lt;br /&gt;&lt;br /&gt;so I happily entered&lt;br /&gt;&lt;br /&gt;Username : lotso&lt;br /&gt;Password : lotso&apos;s password&lt;br /&gt;Domain : lotso&apos;s windows domain&lt;br /&gt;&lt;br /&gt;and ended up with the errors above!&lt;br /&gt;&lt;br /&gt;......&lt;br /&gt;.....&lt;br /&gt;2 hours later and MUCH googling&lt;br /&gt;.....&lt;br /&gt;....&lt;br /&gt;&lt;br /&gt;I tried&lt;br /&gt;Username : lotso&apos;s windows domain\lotso&lt;br /&gt;Password : lotso&apos;s password&lt;br /&gt;Domain : &amp;lt;blank&amp;gt;&lt;br /&gt;&lt;br /&gt;and it WORKED!!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Idiosyncrasies!!&lt;br /&gt;&lt;br /&gt;On the other hand, this is _not_ a bug in NM or PPTP: it seems that, per digitalwound, his setup works fine with those 3 options filled in separately.</description>
  <comments>http://lotso.livejournal.com/107221.html</comments>
  <category>linux</category>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/107001.html</guid>
  <pubDate>Sat, 08 Nov 2008 18:31:49 GMT</pubDate>
  <title>FOSS, My 2008 Pictures</title>
  <link>http://lotso.livejournal.com/107001.html</link>
  <description>Nowadays, I blog very little about OSS.&lt;br /&gt;&lt;br /&gt;Here are some pics instead:&lt;br /&gt;&lt;br /&gt;http://flickr.com/photos/lotso&lt;br /&gt;&lt;br /&gt;&lt;img src=&quot;http://farm4.static.flickr.com/3168/3013497190_1a26e36fe4_m.jpg&quot; alt=&quot;&quot; /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;</description>
  <comments>http://lotso.livejournal.com/107001.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/106681.html</guid>
  <pubDate>Tue, 12 Aug 2008 15:11:21 GMT</pubDate>
  <title>python-2.4 to python 2.5 upgrade hell</title>
  <link>http://lotso.livejournal.com/106681.html</link>
  <description>It&apos;s been a while since the last post.&lt;br /&gt;&lt;br /&gt;Not much has been happening, except that I&apos;m missing out on life due to work commitments, which seriously sucks.&lt;br /&gt;&lt;br /&gt;In any case, I&apos;ve been battling with the Python upgrade on my Gentoo box at home.&lt;br /&gt;&lt;br /&gt;I&apos;ve been having undefined symbol issues with pygtk and pygobject all through last week. I solved them about 10 minutes ago, so now I can finally go to sleep.&lt;br /&gt;&lt;br /&gt;The main issue was breakage when compiling a Python app which needed GTK support. The catch is that it also needed &quot;threads&quot; support which, for some reason, is not turned on by default in Python 2.5 but is in Python 2.4!&lt;br /&gt;&lt;br /&gt;Hence, I spent the last week pulling my hair out, trying all the different permutations to work out WHY it USED to work and doesn&apos;t now.&lt;br /&gt;&lt;br /&gt;It came down to a simple USE flag which was not checked during the ebuild checks.&lt;br /&gt;&lt;br /&gt;Why.. Oh.. Why...</description>
  <comments>http://lotso.livejournal.com/106681.html</comments>
  <category>linux</category>
  <category>gentoo</category>
  <lj:security>public</lj:security>
  <lj:reply-count>8</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/106296.html</guid>
  <pubDate>Mon, 23 Jun 2008 01:32:15 GMT</pubDate>
  <title>Automatic Raid Array Rebuilding</title>
  <link>http://lotso.livejournal.com/106296.html</link>
  <description>Hi guys, long time no post. The last post was in March and it&apos;s already June.&lt;br /&gt;&lt;br /&gt;I&apos;ve been busy as usual, but I haven&apos;t been dabbling as much as I &quot;should&quot;, as I&apos;ve been occupied with other NON-FOSS related stuff. (psst: I&apos;m now heavily into photography. Went to shoot some Japan GT queens!! Kawaaiii)&lt;br /&gt;&lt;br /&gt;Anyway, since this is a (nearly) purely FOSS-based blog, I&apos;m gonna talk about my automatic RAID rebuilding script.&lt;br /&gt;&lt;br /&gt;What happens is this: my PostgreSQL box (Celeron, 2x500GB in RAID 1) has a tendency to keep dying once in a while for X reasons. (Until now, I&apos;ve been unable to find out why it dies so often.) I&apos;ve tried the write-all, read-all test using dd, but so far I haven&apos;t seen any errors being thrown. So it&apos;s been a manual routine of...&lt;br /&gt;&lt;br /&gt;go to work, see the email: Your raid has Died!&lt;br /&gt;log onto the box, do the rebuild.&lt;br /&gt;&lt;br /&gt;After a while, this just becomes tiring, and I decided to fsck it and make it automatic.&lt;br /&gt;&lt;br /&gt;Here&apos;s the script&lt;br /&gt;&lt;br /&gt;#!/bin/bash&lt;br /&gt;&lt;br /&gt;# Pick the device name out of the line that mdadm marks as faulty&lt;br /&gt;FAIL_DRV=$(mdadm --detail /dev/md0 | grep faulty | awk &apos;{print $6}&apos;)&lt;br /&gt;&lt;br /&gt;if [ -n &quot;$FAIL_DRV&quot; ]&lt;br /&gt;then&lt;br /&gt;&amp;nbsp; echo &quot;Detected degraded array : $FAIL_DRV&quot;&lt;br /&gt;&amp;nbsp; echo &quot;Starting automated array rebuild process&quot;&lt;br /&gt;&amp;nbsp; # Mark the drive failed, remove it, then re-add it to trigger a rebuild&lt;br /&gt;&amp;nbsp; mdadm /dev/md0 --fail $FAIL_DRV --remove $FAIL_DRV --add $FAIL_DRV&lt;br /&gt;else&lt;br /&gt;&amp;nbsp; echo &quot;Nothing to do&quot;&lt;br /&gt;fi&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Simple eh.. &lt;br /&gt;&lt;br /&gt;So now I don&apos;t have to come to work and find it all wonky, because it&apos;ll automatically rebuild itself.&lt;br /&gt;&lt;br /&gt;Some of you may ask: why don&apos;t I just replace the drive?
Because I can&apos;t find any replacement drive which has a PATA connection and 500GB capacity! The largest I can find are 160GB.&lt;br /&gt;&lt;br /&gt;Bummer</description>
  <comments>http://lotso.livejournal.com/106296.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/106139.html</guid>
  <pubDate>Fri, 04 Apr 2008 21:00:51 GMT</pubDate>
  <title>GVFS makes me happy</title>
  <link>http://lotso.livejournal.com/106139.html</link>
  <description>Did you know that Nautilus has now moved from the older GnomeVFS module to the new GVFS (&lt;s&gt;gentoo&lt;/s&gt;GNOME virtual filesystem)?&lt;br /&gt;&lt;br /&gt;The new one is partially built on top of FUSE, or rather integrates with FUSE, and it makes mounting and accessing files on network shares a much better experience than before.&lt;br /&gt;&lt;br /&gt;With GVFS, when you browse to a share in Nautilus, it’s automatically mounted under ~/.gvfs&lt;br /&gt;&lt;pre&gt;&lt;code&gt;
gvfs-fuse-daemon on /home/gentoo/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=gentoo)

~/.gvfs $ ls -al
total 12
dr-x------   3 gentoo users     0 Apr  5 03:38 .
drwx------ 118 gentoo users 12288 Apr  5 04:37 ..
drwx------   1 gentoo users     0 Feb  1 21:20 mediacenter on 192.168.10.111
&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and thus, you can actually use applications to access the items under those mount points.&lt;br /&gt;&lt;br /&gt;For example, one of the reasons I rely heavily on Totem and Nautilus is their SMB support: I can connect to my home mediacenter server and stream videos from it without going through the motions of actually mounting the drive/shares.&lt;br /&gt;&lt;br /&gt;(Did you know that only Totem and Nautilus share this feature in GNOME, and the rest of the programs are brain-dead in this regard?)&lt;br /&gt;&lt;br /&gt;Kinks:&lt;br /&gt;Currently, based on my limited testing (it’s 4+am as I write this, as I was playing with GVFS + Nautilus 2.22), there are bugs when you try to access an SMB share which is password protected. For X reasons, the usual Nautilus “password verification” box does not come up.&lt;br /&gt;&lt;br /&gt;And putting smb://user:password@server/share does not work either.&lt;br /&gt;&lt;br /&gt;But if you were to drop down to the CLI and do a&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;
$ gvfs-mount smb://gentoo@192.168.10.2/storage
Password required for share storage on 192.168.10.2
Domain [HOME.NET]:
Password:
&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;then that works: you’ll get the mountpoint in ~/.gvfs and can access files from that location.&lt;br /&gt;It’s an additional step, but hey, at least now it’s&lt;br /&gt;&lt;br /&gt;1. Transparent (to an extent), and most apps can see it (tried mplayer/gmplayer/xine)&lt;br /&gt;2. FAST. It’s a good deal faster than the previous incarnation of using the SMB protocol through Nautilus. (I don’t have any real stats)&lt;br /&gt;3. Prone to crashing.</description>
  <comments>http://lotso.livejournal.com/106139.html</comments>
  <category>linux</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/105915.html</guid>
  <pubDate>Thu, 06 Mar 2008 00:36:27 GMT</pubDate>
  <title>Steps to secure your site</title>
  <link>http://lotso.livejournal.com/105915.html</link>
  <description>So, in today’s lesson I will elaborate on how one site decided to add “protection” against phishing or, in more general terms, how to secure your site against malware and other badware.&lt;br /&gt;&lt;br /&gt;1. Open an account with RHBBank (rhbbank.com.my)&lt;br /&gt;2. Subscribe to internet banking&lt;br /&gt;3. Go overseas&lt;br /&gt;4. Attempt to pay your credit card fees etc. via the internet&lt;br /&gt;5. Pull your hair out in the attempt&lt;br /&gt;&lt;br /&gt;So basically, I’ve been trying to access RHBbank’s secure site (&lt;a href=&apos;https://logon.rhbbank.com.my/&apos; rel=&apos;nofollow&apos;&gt;https://logon.rhbbank.com.my/&lt;/a&gt;) and keep getting either permission denied or server errors or something along those lines.&lt;br /&gt;&lt;br /&gt;So, on a hunch, I tunnelled to my home squid proxy server and used that as the proxy for Firefox. I fired up the browser and was greeted with the RHB secure page!!&lt;br /&gt;&lt;br /&gt;I opened up Opera (normal settings), fired up the same page, and got “internal server error”.&lt;br /&gt;&lt;br /&gt;So, one of two things is happening.&lt;br /&gt;&lt;br /&gt;1. RHB is looking at IP addresses and denying access to anyone outside the M’sia IP address range&lt;br /&gt;2. My company’s outgoing filter regards RHBbank as malware etc. and prohibits me from visiting it.&lt;br /&gt;&lt;br /&gt;Funny business.</description>
  <comments>http://lotso.livejournal.com/105915.html</comments>
  <category>rants</category>
  <lj:mood>aggravated</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/105481.html</guid>
  <pubDate>Thu, 28 Feb 2008 06:31:26 GMT</pubDate>
  <title>Open Source Trend - M&apos;sia 2nd in line</title>
  <link>http://lotso.livejournal.com/105481.html</link>
  <description>I just noticed this from google trends.&lt;br /&gt;&lt;br /&gt;&lt;a href=&apos;http://www.google.com/trends?q=open+source&amp;ctab=0&amp;geo=all&amp;date=all&amp;sort=0&apos; rel=&apos;nofollow&apos;&gt;http://www.google.com/trends?q=open+source&amp;ctab=0&amp;geo=all&amp;date=all&amp;sort=0&lt;/a&gt;</description>
  <comments>http://lotso.livejournal.com/105481.html</comments>
  <category>linux</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/105424.html</guid>
  <pubDate>Tue, 05 Feb 2008 01:48:54 GMT</pubDate>
  <title>Location..Location..Location</title>
  <link>http://lotso.livejournal.com/105424.html</link>
  <description>I’m in San Jose. Still pondering if I can make it to the Local PUG (postgresql user group) meeting to be held on Feb 12 since I’m here.&lt;br /&gt;&lt;br /&gt;Will get the chance to meet David Fetter and team.&lt;br /&gt;&lt;br /&gt;I’ll see what happens.&lt;br /&gt;&lt;br /&gt;PS : I freaking hate it here this time of year. It’s cold and so are my fingers! I need to constantly rub my hands together</description>
  <comments>http://lotso.livejournal.com/105424.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>6</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/104993.html</guid>
  <pubDate>Sun, 27 Jan 2008 09:51:27 GMT</pubDate>
  <title>You say Lemon, I say Lemonade (A story)</title>
  <link>http://lotso.livejournal.com/104993.html</link>
  <description>The past few weeks were not all that great: in addition to facing extra challenges at my primary day job, I also had to deal with my pet project there, which is meant to help smooth my day job’s activities.&lt;br /&gt;&lt;br /&gt;Some of you may know that my pet project involves pulling gobs of data into a PG instance to make my own version of a company datamart. I’m not talking about small gobs of data, but something in the range of 200+GB. (It was more, but in one of my efforts to control/tune the server, I deleted close to 2-3 months’ worth of data.)&lt;br /&gt;&lt;br /&gt;200+GB may not seem like much to you guys who get to play with some real iron or some “real” server hardware. All I had was a Celeron 1.7G w/ 768MB of RAM and some gobs of IDE 7200 RPM drives. In short, all I had was lemons and I needed to make the best of it!&lt;br /&gt;&lt;br /&gt;Actually, all was working fine and dandy up until I decided to make a slave server using Slony-I + PGpool. While that was a good decision, the hardware involved was the same if not worse (512MB RAM only). When I started to implement that, I was faced with 2 issues.&lt;br /&gt;&lt;br /&gt;1. Replication would lag behind by up to a day or so, because waiting for the next sync (the dreaded fetch 100 from log) was taking too long.&lt;br /&gt;2. My nightly vacuum job went from an average of 4+ hours to something like 27+ hours.&lt;br /&gt;&lt;br /&gt;So, in an effort to get things under control, I went down a few paths and hit more than my share of stumbling blocks. One of the things I tried was to reduce the amount of “current” data in a particular table from 1 month -&amp;gt; 2 weeks -&amp;gt; 1 week (moving the rest into a so-called archive table, but still in the same tablespace). This didn’t go well: I initially tried to move the data in 3-hour chunks, which failed, then in 1-hour chunks, and finally in 15-minute chunks.
&lt;br /&gt;&lt;br /&gt;But in the end, it was all really futile, because what I was essentially doing was just generating more and more IO activity (and that’s not a good thing). In addition to that, I also had to deal with vacuuming the tables due to PG’s MVCC design, and that was also not fun.&lt;br /&gt;&lt;br /&gt;So, in the end, I broke my 3x500GB RAID 1 mirror (1 spare disk) and used the spare as the Slony-I log partition. Initially, that wasn’t all I did: I also moved the 2 main problematic tables from the main RAID 1 tablespace into that single-disk tablespace. That was also a mistake, and it didn’t help at all. IO activity was still high and I still wasn’t able to fix my vacuuming process.&lt;br /&gt;&lt;br /&gt;Time for another plan.&lt;br /&gt;&lt;br /&gt;This time around, what I did was move the 2 big tables back into the RAID 1 tablespace and leave the Slony logs on the single disk. In addition to that, I also made a few alterations to the manner in which I pull data from the main MSSQL database and the way it is inserted into PG. &lt;br /&gt;&lt;br /&gt;This time around, I’m utilising partitioning and some additional pgagent rules to automatically switch to a new table every 7 days, and in doing so I also had to change a few other items to get things to work smoothly. I did this last Friday and, based on the emailed logs, I think I’ve made a good decision: right now everything seems peachy, with the vacuum back to ~4 hours and no lag in the Slony replication.&lt;br /&gt;&lt;br /&gt;I still have another thing to do, which is to alter the script I use to pull from the main DB, as I’m being kicked (requested) to pull from an alternate DB which has a slightly different architecture.&lt;br /&gt;&lt;br /&gt;A 2-disk RAID 1 is definitely MUCH better than a single-disk tablespace. With the amount of read/write activity that I have, a single disk is just not doable.&lt;br /&gt;&lt;br /&gt;So, that’s how I made lemonade with my lemons. (hmm..
does this sound right?)</description>
  <comments>http://lotso.livejournal.com/104993.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/104934.html</guid>
  <pubDate>Sat, 12 Jan 2008 12:39:40 GMT</pubDate>
  <title>Postgresql 8.3 Features I&apos;m looking forward to</title>
  <link>http://lotso.livejournal.com/104934.html</link>
  <description>PG 8.3 is coming along soon (although I read from Bruce M that there&apos;s likely to be an RC2 coming out).&lt;br /&gt;&lt;br /&gt;In any case, I looked through the &lt;a href=&quot;http://developer.postgresql.org/index.php/WhatsNew83&quot; rel=&quot;nofollow&quot;&gt;pgwiki&lt;/a&gt; and it looks like there are only 2 features which I&apos;m looking forward to.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;HOT&lt;/li&gt;&lt;li&gt;CREATE TABLE ... LIKE ... INCLUDING INDEXES (although right now, I automate this via a stored procedure/function)&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;The other thing which is nice, but not absolutely necessary, is the multiple autovacuum workers feature. My concern is largely with the few very large tables which I used to have. (I&apos;ve since sliced them down into partitions by date range to keep them manageable. I initially just wanted to see how _much_ data my system** could cope with before it started to bog down. BTW, it turned out to be approx. 200 million rows, and now I know.)&lt;br /&gt;&lt;br /&gt;Of late, the nightly vacuum has been taking a long time, and this is in part my own fault due to a design issue. I won&apos;t go into this too much, but I know that I need to take another look at my current ETL implementation and where the data goes in the DB.&lt;br /&gt;&lt;br /&gt;As of right now, I&apos;m pulling data from an MSSQL server into PG to build a data-mart. My current process involves pulling from MSSQL into a table in PG.
Unlike the usual method of partitioning, namely a master table that holds no data with inserts going directly into the partitions, I chose to insert into the master table and then, 1 week later, start offloading data from the master table into the partition. (I started with 1 month, then 2 weeks, and ended up keeping 1 week&apos;s worth of current data in the master table.)&lt;br /&gt;&lt;br /&gt;Master Table (1 wk data)&lt;br /&gt;-&amp;gt;partition_200710&lt;br /&gt;-&amp;gt;partition_200711&lt;br /&gt;-&amp;gt;partition_200712&lt;br /&gt;&lt;br /&gt;I was looking at my system&apos;s load and found that it&apos;s always in IO wait. Performing a vacuum on the large table after the data offload into the partition took quite a while because of:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;The size of the table&lt;/li&gt;&lt;li&gt;The indexes, which are sometimes even larger than the table itself&lt;/li&gt;&lt;li&gt;The number of indexes on that table&lt;/li&gt;&lt;li&gt;My usage of a concatenated primary key named unique_id to simplify the loading process, which ended up being a bad decision because I needed to index the same key columns (non-concatenated) anyway to improve join performance. Hence, in some sense, I have double the amount to vacuum through. Bad. Bad. (David Fetter warned me of this but I chose to shoot myself in the foot anyway.)&lt;/li&gt;&lt;/ol&gt;So, I figured that by reducing the amount of data in that particular table, I could well reduce the amount of time spent vacuuming it. (Note that I don&apos;t know how true this hypothesis of mine is, but I&apos;m giving it a shot anyhow.)&lt;br /&gt;&lt;br /&gt;Note: I&apos;m looking forward to 8.4. I don&apos;t really know when it will arrive, but I&apos;m hoping that by then (on-disk) bitmap indexes will be available and my (multiple) indexes (up to 8 on a table) can be made smaller and more efficient.
&lt;br /&gt;&lt;br /&gt;** : The system in question is a Celeron 1.7G/768MB RAM with 2x500GB RAID 1 and a ~250GB DB</description>
  <comments>http://lotso.livejournal.com/104934.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/104504.html</guid>
  <pubDate>Thu, 10 Jan 2008 09:11:36 GMT</pubDate>
  <title>The Doraemons Conversations</title>
  <link>http://lotso.livejournal.com/104504.html</link>
  <description>After such a long wait, including the long wait to compile Qt and to download the Skype 2.0 beta, I finally got the el-cheapo webcam I bought nearly a year ago to work. (Actually, I think it was longer ago than that; it was during version 1.3 of Skype, IIRC.)&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.flickr.com/photos/lotso/2182061881/&quot; title=&quot;dsc00210-small by lotso, on Flickr&quot; rel=&quot;nofollow&quot;&gt;&lt;img src=&quot;http://farm3.static.flickr.com/2023/2182061881_b3f5174065_o.jpg&quot; width=&quot;240&quot; height=&quot;180&quot; alt=&quot;dsc00210-small&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The driver used was gspcav1, and for the webcam to work you have to enable video4linux support in the kernel (which I hadn’t bothered to do, since Skype didn’t support video in the olden days anyway).&lt;br /&gt;&lt;br /&gt;So, after all that, I now have Skype for Linux working. (I didn’t test sound, though.)&lt;br /&gt;&lt;br /&gt;But I’m happy that I don’t need to spend &lt;a href=&quot;http://www.bytebot.net/blog/archives/2008/01/05/skype-video-and-a-logitech-webcam&quot; rel=&quot;nofollow&quot;&gt;RM65 to get another webcam&lt;/a&gt; like Colin did.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.flickr.com/photos/lotso/2182061885/&quot; title=&quot;dsc00205-small by lotso, on Flickr&quot; rel=&quot;nofollow&quot;&gt;&lt;img src=&quot;http://farm3.static.flickr.com/2348/2182061885_71c67d6fe2_o.jpg&quot; width=&quot;240&quot; height=&quot;180&quot; alt=&quot;dsc00205-small&quot; /&gt;&lt;/a&gt;</description>
  <comments>http://lotso.livejournal.com/104504.html</comments>
  <category>linux</category>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/104440.html</guid>
  <pubDate>Sun, 06 Jan 2008 06:24:57 GMT</pubDate>
  <title>SQL - pgpool-II (Step 2)</title>
  <link>http://lotso.livejournal.com/104440.html</link>
  <description>So, this is step 2 in getting replication + load balancing to work for PostgreSQL.&lt;br /&gt;&lt;br /&gt;I&apos;ve already detailed the 1st step, getting Slony to work, in a previous post. (That was on a development machine/VMware image. When I tried it on the production/slave server, I was faced with some issues which I might elaborate on in another post. It all boils down to me shooting myself in the foot. What to do, it was a time when I wasn&apos;t connected to the internet and thus had no googling privileges.)&lt;br /&gt;&lt;br /&gt;So, here is my experience with pgpool, and it&apos;s also a little bit like shooting myself in the foot (again!)&lt;br /&gt;&lt;br /&gt;First off, I started out using the _wrong_ version of pgpool. The newest version of pgpool-II (note: pgpool-II and not pgpool-I) is 2.0.1, and the newest version of pgpool found on the yum mirrors (I&apos;m using CentOS 4/5) was 2.01 (well, the numbers match, don&apos;t they?). The only difference was that the one available on the yum mirrors was pgpool-I and not pgpool-II. However, documentation on pgpool was sparse (I googled everywhere, read all the relevant and NON-relevant mailing lists, and found nothing much to go on), so I didn&apos;t notice.&lt;br /&gt;&lt;br /&gt;It was not until I signed up to the pgpool mailing list (which is very low volume, by the way) and interacted with one of the Japanese developers that I found out I was in fact using the OLD pgpool, pgpool-I, which unfortunately had the same version number as pgpool-II!&lt;br /&gt;(I even downloaded the tarball from pgfoundry [and I _did_ download the _correct_ tarball] and searched through the source to figure out what was happening.)&lt;br /&gt;&lt;br /&gt;Because of that little (big!) mistake, I was tearing my hair out for 3+ weeks. (Well, I didn&apos;t play with it every day, what with my day job and such....)
However, I did get pgpool-I to work properly with a little tweaking, and I could get load balancing to work, albeit not as advertised: I couldn&apos;t get it to work without it also functioning as replication (of sorts, anyway), which was the reason I couldn&apos;t deploy it, as I was using Slony.&lt;br /&gt;&lt;br /&gt;So, after I found out my mistake last Friday, I started to google for an RPM of pgpool-II (newest version 2.0.1) but was unable to locate one anywhere. The latest RPM I could find was version 1.3, which was _too_ old in a sense. (It&apos;s always better to have the latest stable version.) So, I had to engineer a way to get an RPM from the tarball. Luckily, the tarball from pgfoundry also contained the pgpool.spec file, which was packaged by Devrim. Unfortunately for me, the spec file was a little old in that it referred to the 2.0 beta1 version. That wasn&apos;t too much of an issue, as all it needed was a little hack here and a little hack there. (I was getting a bad owner/group permission error, which I narrowed down to the .spec file not having valid users/groups.) &lt;br /&gt;&lt;br /&gt;After that, I ran rpmbuild -ba pgpool.spec and got an RPM.&lt;br /&gt;&lt;br /&gt;Then I just installed it, configured pgpool.conf, and got it up and running as advertised, with replication mode off, master/slave mode on and load balancing mode on.&lt;br /&gt;&lt;br /&gt;Cool.. I&apos;m rolling this out to production on Monday.&lt;br /&gt;&lt;br /&gt;So, this means I&apos;ll have 1x master (1.7G Celeron/768MB RAM, 500G RAID 1 with ~200GB of data) and 1x slave (1.7G Celeron/512MB RAM, 3x160GB RAID 0). I still have another box sitting under my desk which has even poorer specs than the above, but I think it&apos;ll work out just fine.&lt;br /&gt;&lt;br /&gt;Cool..Ultra Cool Even!!&lt;br /&gt;&lt;br /&gt;If anyone wants the RPM or the modified spec file, do drop me a line and I&apos;ll post it to you or something.</description>
  <comments>http://lotso.livejournal.com/104440.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:mood>cheerful</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>29</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/104119.html</guid>
  <pubDate>Sun, 23 Dec 2007 17:18:23 GMT</pubDate>
  <title>SQL - Slony-I (step 1)</title>
  <link>http://lotso.livejournal.com/104119.html</link>
  <description>Been playing around with some level of replication for Postgresql. Like in all FOSS based software, there is lots of choices to choose from and that, in itself, though a blessing is also a curse. There’s just too many choices! (Both Foss and Non-Foss per se)&lt;br /&gt;&lt;br /&gt;1. Sequoia&lt;br /&gt;2. PgCluster&lt;br /&gt;3. CyberCluster&lt;br /&gt;4. Slony-I&lt;br /&gt;5. PgPool&lt;br /&gt;6. Skytools (this is skype)&lt;br /&gt;&lt;br /&gt;and i believe the list goes on. In any case, my requirements are just 2 I think. (for now anyway)&lt;br /&gt;&lt;br /&gt;i. Replicate only a subset of the tables. (not the entire db)&lt;br /&gt;(AFAIK, pgcluster, while easier to configure is also an entire DB replication solution, which is not what I wanted)&lt;br /&gt;&lt;br /&gt;ii. Connection load balancing to a few read-only slaves (for select queries only)&lt;br /&gt;&lt;br /&gt;Hence, based on the overflowing amount of information of which option to choose, I finally arrived at using slony-I and pgpool and of the two options, I’ve (more or less) already completed the configuration of Slony-I.&lt;br /&gt;&lt;br /&gt;For Slony-I, I made sure that I understood how to do the “old-style” which is by using the cli, before I moved on to doing the rest of the configuration using pgadmin which is way easier.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;i&gt;&lt;u&gt;Slony-I&lt;/u&gt;&lt;/i&gt;&lt;/b&gt; &lt;br /&gt;There are a few caveats when using Slony-I and I’ll list down my experiences when I’m playing with it using both gentoo and centos 4 (this is running in a VM)&lt;br /&gt;&lt;br /&gt;1st off, version 1.2.12 is out from the slony-website but gentoo is still at 1.2.10. 
The easiest thing to do about this is to hack the ebuild, change the version from 1.2.10 --&amp;gt; 1.2.12 (gentoo bug #143600) and move it to /usr/local/portage.&lt;br /&gt;&lt;br /&gt;So, in that sense, building on gentoo was relatively straightforward, a job of less than 10 minutes (excluding compilation).&lt;br /&gt;&lt;br /&gt;But on centos it’s another matter, since there’s no prebuilt rpm supplied. Only a src rpm was supplied, and I’m not too familiar with those. (I switched to gentoo nearly 4/5 years ago as I hated fedora’s upgrade cycle, and centos was “supposed” to be server-grade.)&lt;br /&gt;&lt;br /&gt;In any case, most of the caveats are with centos. For one, since this is a src.rpm, you have to compile it first.&lt;br /&gt;&lt;br /&gt;Hence, you need these additional packages:&lt;br /&gt;&lt;br /&gt;1. bison&lt;br /&gt;2. flex&lt;br /&gt;3. gcc (and all its dependencies)&lt;br /&gt;4. rpm-build&lt;br /&gt;5. postgresql-devel&lt;br /&gt;6. docbook-style-dsssl&lt;br /&gt;7. netpbm-progs (and its netpbm dependency)&lt;br /&gt;8. (there might be more, as I didn’t document it)&lt;br /&gt;&lt;br /&gt;Once you start compiling, you’ll run into 1 error, which is caused by the NAMELEN of the docs. This is tracked as bug &lt;a href=&quot;https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=159382&quot; rel=&quot;nofollow&quot;&gt;#159382&lt;/a&gt;, and the solution is either to upgrade to centos 5 (supposed to be fixed in that release. Keyword = supposed) or to hack it. (I chose to hack it.)&lt;br /&gt;&lt;br /&gt;Depending on where your docbook files are, you can do this:&lt;br /&gt;&lt;em&gt;&lt;code&gt;&lt;pre&gt;
cd /usr/share/sgml &amp;&amp; perl -pi.bak -e &apos;s/(NAMELEN\s+)44/${1}256/&apos; $(find . -type f | xargs grep &apos;NAMELEN.*44&apos; | sed -e &apos;s/:.*//&apos;)
&lt;/pre&gt;&lt;/code&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;So, after that was resolved (which took 1-2 hours of scouring the net etc.), I moved on to the experimenting stage. I used articles from these few locations:&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://slony.info/documentation/&quot; rel=&quot;nofollow&quot;&gt;slony-i official docs&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://odyssi.blogspot.com/2007/10/postgresql-replication-with-slony-i.html&quot; rel=&quot;nofollow&quot;&gt;WhoAmI’s Blog&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://www.onlamp.com/pub/a/onlamp/2004/12/16/slony_install.html?page=2&quot; rel=&quot;nofollow&quot;&gt;OnLamp Article from 2005&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://www.pgadmin.org/archives/pgadmin-support/2007-09/msg00101.php&quot; rel=&quot;nofollow&quot;&gt;Pgadmin Archives&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://www.pgadmin.org/docs/1.8/slony-overview.html&quot; rel=&quot;nofollow&quot;&gt;Pgadmin Docs&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Anyway, here are a few more caveats with the configuration:&lt;br /&gt;&lt;br /&gt;1. Ensure you use a .pgpass file for the passwords (chmod go-rwx ~/.pgpass)&lt;br /&gt;&lt;em&gt;&lt;pre&gt;
192.168.10.100:5432:*:postgres:pguserpassword
192.168.10.20:5432:*:postgres:pguserpassword
&lt;/pre&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;2. Ensure that you use sane configs in your pg_hba.conf file. (Start with trust/ident authentication, so that if something isn’t working you can rule authentication out as the cause.)&lt;br /&gt;&lt;br /&gt;3. Ensure that the connection string used for slon/slonik also includes “user=postgres”.&lt;br /&gt;(Notice that this &lt;a href=&quot;http://odyssi.blogspot.com/2007/10/postgresql-replication-with-slony-i.html&quot; rel=&quot;nofollow&quot;&gt;guide&lt;/a&gt; doesn’t specify the user to connect as in the slonik shell script. This caused me some headaches, as I was getting both a password error and some “cannot connect admin node xxx” issues.)&lt;br /&gt;&lt;br /&gt;4. Create the replication either directly with shell scripts or with pgadmin3. (I followed both the examples from the pgadmin docs and the mail I found on the pgadmin mailing list - links provided above, with the exception that I didn’t make it 2 way as in slave&amp;lt;--&amp;gt;master but only master--&amp;gt;slave and slave--&amp;gt;master.)&lt;br /&gt;&lt;br /&gt;5. Starting the slon process is as simple as this (I used a config file instead):&lt;br /&gt;$cat &amp;gt; slon_master.conf&lt;br /&gt;cluster_name = &apos;pgcluster&apos;&lt;br /&gt;conn_info = &apos;dbname=testcluster host=192.168.10.20 user=postgres&apos;&lt;br /&gt;^C&lt;br /&gt;&lt;br /&gt;$slon -d4 -f slon_master.conf&lt;br /&gt;&lt;br /&gt;(-d4 to give lots of debug output)&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;b&gt;on the Master DB&lt;/b&gt;&lt;/u&gt;&lt;br /&gt;&lt;em&gt;&lt;pre&gt;
2007-12-24 01:04:12 MYT DEBUG2 syncThread: new sl_action_seq 1 - SYNC 217
2007-12-24 01:04:16 MYT DEBUG2 localListenThread: Received event 10,217 SYNC
2007-12-24 01:04:17 MYT DEBUG2 remoteListenThread_1: queue event 1,195 SYNC
2007-12-24 01:04:17 MYT DEBUG2 remoteListenThread_1: UNLISTEN
2007-12-24 01:04:22 MYT DEBUG2 syncThread: new sl_action_seq 1 - SYNC 218
&lt;/em&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;&lt;b&gt;on the Slave DB&lt;/b&gt;&lt;/u&gt;&lt;br /&gt;&lt;em&gt;&lt;pre&gt;
2007-12-23 22:05:02 MYT DEBUG2 remoteWorkerThread_10: SYNC 227 processing
2007-12-23 22:05:02 MYT DEBUG2 remoteWorkerThread_10: no sets need syncing for this event
2007-12-23 22:05:04 MYT DEBUG2 remoteListenThread_10: queue event 10,228 SYNC
2007-12-23 22:05:04 MYT DEBUG2 remoteWorkerThread_10: Received event 10,228 SYNC
2007-12-23 22:05:04 MYT DEBUG3 calc sync size - last time: 1 last length: 2005 ideal: 29 proposed size: 3
&lt;/pre&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;6. BTW, there’s no need to do a database dump and restore of the tables you want replicated. It’s just as good to create the schema w/o any data and start the slon processes. I learned that all my effort to dump and restore the replicated tables went down the drain, as slony-I will just truncate the table (this was a command I caught a glimpse of when slon started) and start again from scratch. (I really wonder if this is intended behaviour. What happens when the slon processes go down? It seems quite fragile, so I’ll have to look into that.)&lt;br /&gt;&lt;br /&gt;Next up is to look at pg-pool. That’ll be another fun(?) thing to look at??&lt;br /&gt;&lt;br /&gt;BTW, I’m looking to do the replication to another (low end celeron) box, perhaps with a raid0 out of 3 drives for greater performance(?), and then use pg-pool to load balance to the raid0 box.&lt;br /&gt;&lt;br /&gt;Build performance and redundancy through multiple unreliable boxes, eh? The google philosophy. &lt;br /&gt;I suspect the few low end boxes lying around in the office can be put to use.</description>
  <comments>http://lotso.livejournal.com/104119.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/103701.html</guid>
  <pubDate>Tue, 18 Dec 2007 19:29:23 GMT</pubDate>
  <title>Gnome-2.20 - Totem Backend Changed to Gstreamer (By Default)</title>
  <link>http://lotso.livejournal.com/103701.html</link>
  <description>This sucks.. it’s 3am and I’m battling with Gnome-2.20 and the new Totem gstreamer backend, which is refusing to play nice with RMVB files. (Actually, I think it is unable to handle any codec which is not supported by gstreamer - which is also why there’s the pitfdll plugin - which is not in gentoo’s portage by the way.)&lt;br /&gt;&lt;br /&gt;I really like totem as it integrates nicely with gnome (in general), not to mention that it is also able to play/stream from a smb share, unlike the rest of gnome and linux (in general and in my experience anyway). With all the other options, one has to mount the smb share in linux, or copy the entire file onto the system, before one can play it, which totally sucks by the way.&lt;br /&gt;&lt;br /&gt;It also seems that I was stuck on totem-2.16 (I can’t remember why), and in totem-2.18 the gentoo people changed the default totem backend from xine to gstreamer (even though xine is still a supported backend, based on what I read on the totem website).&lt;br /&gt;&lt;br /&gt;So, after going through the internet for “possible” solutions, I finally ended up hacking the ebuild to suit _my_ needs.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&lt;em&gt;
 $ diff -Nau /usr/portage/media-video/totem/totem-2.20.1.ebuild /usr/local/portage/media-video/totem/totem-2.20.1.ebuild
--- /usr/portage/media-video/totem/totem-2.20.1.ebuild  2007-11-29 14:06:23.000000000 +0800
+++ /usr/local/portage/media-video/totem/totem-2.20.1.ebuild    2007-12-19 02:55:44.000000000 +0800
@@ -103,7 +103,7 @@
        # use global mozilla plugin dir
        G2CONF=&quot;${G2CONF} MOZILLA_PLUGINDIR=/usr/$(get_libdir)/nsbrowser/plugins&quot;

-       G2CONF=&quot;${G2CONF} --disable-vala --disable-vanity --enable-gstreamer --with-dbus&quot;
+       G2CONF=&quot;${G2CONF} --disable-vala --disable-vanity --enable-xine --disable-gstreamer --with-dbus&quot;

        if use gnome ; then
            G2CONF=&quot;${G2CONF} --disable-gtk --enable-nautilus&quot;

&lt;/em&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Note: This at least made it able to play rmvb files once again.&lt;br /&gt;&lt;br /&gt;Note 2: You may need to make a symlink to your win32codecs install location, as xine defaults to searching in /usr/lib/codecs (which doesn’t exist in gentoo).</description>
  <comments>http://lotso.livejournal.com/103701.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/103530.html</guid>
  <pubDate>Sun, 18 Nov 2007 13:10:27 GMT</pubDate>
  <title>SQL - pgadmin can&apos;t do table inheritance</title>
  <link>http://lotso.livejournal.com/103530.html</link>
  <description>So I found out that, in some sense, using pgadmin is like shooting oneself in the foot (if you’re not in the know).&lt;br /&gt;&lt;br /&gt;Let’s see how many times I’ve shot myself in the foot.&lt;br /&gt;&lt;br /&gt;1. There is no option for moving indexes to a separate tablespace from within pgadmin (1.8).&lt;br /&gt;--&amp;gt; This can be done using psql -&amp;gt; alter index xxx set tablespace fastspace&lt;br /&gt;&lt;br /&gt;2. There’s no option for making a table inherit from a parent table AFTER table creation. (Note that there is an option in the GUI for adding/removing inherited tables, but in mine it’s greyed out.)&lt;br /&gt;--&amp;gt; Via psql -&amp;gt; alter table footable inherit footable_parent&lt;br /&gt;&lt;br /&gt;hmm.. only 2 times.. (at least that’s how many I can remember right now)</description>
  <comments>http://lotso.livejournal.com/103530.html</comments>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/103389.html</guid>
  <pubDate>Sat, 17 Nov 2007 05:09:14 GMT</pubDate>
  <title>SQL - Perl DBI - Updating the rows counts</title>
  <link>http://lotso.livejournal.com/103389.html</link>
  <description>I’m syncing my PG database with the main MSSQL DB at a specified interval, and I was wondering how many records were being deleted/inserted every hour, so that I can get a feel for how much latency there is between the main SQL server and my data mart.&lt;br /&gt;&lt;br /&gt;My solution for pulling data from mssql and inserting it into PG is based on Perl-DBI. Initially, I was wondering how I could get the rowcount (rows affected by a query) so that it can be inserted into a log table. AFAIK, there isn’t a @@rowcount (an mssql feature) in PG. There is GET DIAGNOSTICS var = ROW_COUNT in pl/pgsql (which I use, by the way, when I write pl/pgsql), but since this was Perl-DBI, that feature wasn’t available. Hence the problem.&lt;br /&gt;&lt;br /&gt;I researched it a bit and found out that Perl-DBI returns the number of rows affected by NON-SELECT statements.&lt;br /&gt;&lt;br /&gt;Hence, I set out to incorporate that into my script. (It turns out that it wasn’t all that difficult.)&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;em&gt;&lt;code&gt;
  # We have 4 basic kinds of SQL statement to execute:
  # DELETE / INSERT / UPDATE / TRUNCATE
  my $query0 = &quot;TRUNCATE TABLE $table_name_loading&quot;;
  my $query1 = &quot;DELETE FROM $table_name
                WHERE $unique_id in
                (SELECT $unique_id from $table_name_loading)&quot;;
  my $query2 = &quot;INSERT INTO $table_name SELECT * FROM $table_name_loading&quot;;
  my $query3 = &quot;UPDATE log_sync SET last_sync=?,
                record_update_date_time=current_timestamp
                WHERE table_name=?
                AND db_name = ?&quot;;
  my $query4 = &quot;INSERT INTO log_update(job_name, table_name, from_date, to_date, rows_deleted, rows_inserted)
                VALUES (&apos;mssql_2_pg&apos;,?, ?, ?, ?, ?)&quot;;
  my $query5 = &quot;TRUNCATE TABLE $table_name_loading&quot;;
#DBI-&amp;gt;trace(1);

  # The queries/SQL are wrapped into an EVAL because we
  # expect that these queries MAY fail due to duplicate Primary Keys
  # Note that these are all running as 1 transaction. If anyone failed,
  # we will call the errorhandler, rollback the changes, send email
  # and quit
  eval {
    print &quot;Executing CLEANUP\n&quot; if ($verbose);
    $sth_pg = $dbh_pg-&amp;gt;prepare($query0) or die &quot;prepare failed $DBI::errstr&quot;;
    $sth_pg-&amp;gt;execute();

    print &quot;Executing DELETE\n&quot; if ($verbose);
    #$sth_pg = $dbh_pg-&amp;gt;prepare($query1) or die &quot;prepare failed $DBI::errstr&quot;;
    #$sth_pg-&amp;gt;execute();
&lt;i&gt;&lt;b&gt;    # do() returns the number of rows affected, or the string &quot;0E0&quot; for zero rows
    my $del_rows = $dbh_pg-&amp;gt;do($query1) or die &quot;delete failed $DBI::errstr&quot;;
    if ($del_rows != 0)
    {
      print &quot;Number of rows deleted: &quot; . $del_rows . &quot;\n&quot;;
    } else {
      $del_rows = 0;
    }
&lt;/b&gt;&lt;/i&gt;
    print &quot;Executing INSERT\n&quot; if ($verbose);
    #$sth_pg = $dbh_pg-&amp;gt;prepare($query2) or die &quot;prepare failed $DBI::errstr&quot;;
    #$sth_pg-&amp;gt;execute();
    my $ins_rows = $dbh_pg-&amp;gt;do($query2) or die &quot;insert failed $DBI::errstr&quot;;
    if ($ins_rows != 0)
    {
      print &quot;Number of rows inserted: &quot; . $ins_rows . &quot;\n&quot;;
    } else {
      $ins_rows = 0;
    }

    print &quot;Executing UPDATE\n&quot; if($verbose);
    $sth_pg = $dbh_pg-&amp;gt;prepare($query3) or die &quot;prepare failed $DBI::errstr&quot;;
    $sth_pg-&amp;gt;execute($to_datetime, $table_name, $mssql_default_db);

    print &quot;Executing INSERT INTO LOG\n&quot; if($verbose);
    $sth_pg = $dbh_pg-&amp;gt;prepare($query4) or die &quot;prepare failed $DBI::errstr&quot;;
    $sth_pg-&amp;gt;execute($table_name, $from_datetime, $to_datetime, $del_rows, $ins_rows);

    print &quot;Executing TRUNCATE\n&quot; if ($verbose);
    $sth_pg = $dbh_pg-&amp;gt;prepare($query5) or die &quot;prepare failed $DBI::errstr&quot;;
    $sth_pg-&amp;gt;execute();
  };

  errorhandler(&quot;dbh_pg&quot;);
#  $dbh_pg-&amp;gt;rollback;
# If we got this far, then we commit the transaction
$dbh_pg-&amp;gt;commit;
}
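
# Sketch, not part of the original script: a hypothetical helper
# (rows_affected is a made-up name) that runs a statement with do() and
# returns a plain integer row count. Adding 0 forces numeric context, so
# DBI&apos;s &quot;0E0&quot; (zero but true) becomes 0 without an if/else. Note that
# do() may also return -1 when the driver cannot tell how many rows were
# affected.
sub rows_affected {
  my ($dbh, $sql) = @_;
  my $rows = $dbh-&amp;gt;do($sql);
  die &quot;query failed: $DBI::errstr&quot; unless defined $rows;
  return $rows + 0;
}

# Usage: my $del_rows = rows_affected($dbh_pg, $query1);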
&lt;/code&gt;&lt;/em&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The portion of the code highlighted above was needed because in the log_update table I’ve defined rows_inserted/rows_deleted as integers, and Perl-DBI, in its wisdom, returns &lt;b&gt;0E0&lt;/b&gt; when 0 (zero) rows are affected by the query. (0E0 is DBI’s “zero but true” value: numerically it is 0, but it is a true value in Perl, so the “or die” check doesn’t fire when a query succeeds without touching any rows.) Hence those lines were added to ensure that 0 is logged instead of 0E0.</description>
  <comments>http://lotso.livejournal.com/103389.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/103128.html</guid>
  <pubDate>Sat, 03 Nov 2007 08:47:53 GMT</pubDate>
  <title>SQL - Coolness Factor X</title>
  <link>http://lotso.livejournal.com/103128.html</link>
  <description>This is really cool and I can already see some &lt;s&gt;nasty &lt;/s&gt; nifty stuff to do with this nugget.&lt;br /&gt;&lt;br /&gt;My objective is to see if I can do some kind of DB link from within &lt;a href=&quot;http://www.postgresql.org&quot; rel=&quot;nofollow&quot;&gt; postgresql &lt;/a&gt;  to a Microsoft(tm) SQL Server instance.&lt;br /&gt;&lt;br /&gt;There are a few ways to implement this, namely&lt;br /&gt;&lt;br /&gt;1. &lt;a href=&quot;http://pgfoundry.org/projects/dbi-link/&quot; rel=&quot;nofollow&quot;&gt;dbi-link&lt;/a&gt;&lt;br /&gt;2. &lt;a href=&quot;http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/dblink/&quot; rel=&quot;nofollow&quot;&gt;dblink&lt;/a&gt;&lt;br /&gt;3. &lt;a href=&quot;http://pgfoundry.org/projects/dblink-tds/&quot; rel=&quot;nofollow&quot;&gt;dblink-tds&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In this post, I will talk about option 3, which is basically a way of getting access to a MSSQL or Sybase DB instance from within PG.&lt;br /&gt;&lt;br /&gt;The install process is quite simple (at least on gentoo; I’ve yet to determine how to go about installing it on Centos, which is the deployment/target server).&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;em&gt;&lt;pre&gt;
make
make install

       query
        ------------
        dblink_tds(text, text, text, text) RETURNS setof record
                - returns a set of results from remote query (can be any kind of SQL query);
                - arguments are:
                - 1 text: SQL command string
                - 2 text: Name of the server entry in freetds.conf
                - 3 text: Username used to connect to MS SQL
                - 4 text: Password used to connect to MS SQL

        dblink_tds(text, text, text, text, int) RETURNS setof record
                - returns a set of results from remote query (can be any kind of SQL query);
                - arguments are:
                - 1 text: SQL command string
                - 2 text: Name of the server entry in freetds.conf
                - 3 text: Username used to connect to MS SQL
                - 4 text: Password used to connect to MS SQL
                - 5 int: Port number used to connect to MS SQL (default is 1433)

        dblink_tds(text, text, text, text, int, text) RETURNS setof record
                - returns a set of results from remote query (can be any kind of SQL query);
                - arguments are:
                - 1 text: SQL command string
                - 2 text: Name of the server entry in freetds.conf
                - 3 text: Username used to connect to MS SQL
                - 4 text: Password used to connect to MS SQL
                - 5 int: Port number used to connect to MS SQL
                - 6 text: Complete path to freetds.conf file (default is /etc/freetds/freetds.conf)
&lt;/pre&gt;&lt;/em&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The only drawback of doing a source install rather than using an ebuild is that the default install location is /usr/local/pgsql, but my postgresql library location is /usr/lib/pgsql. Thus, I made a symbolic link as a hack.&lt;br /&gt;&lt;br /&gt;Anyway.. post installation, I did a couple of tests and I’m happy with the results.&lt;br /&gt;&lt;br /&gt;One example usage is&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;em&gt;&lt;pre&gt;
SELECT 
a.id, a.famid, a.dcm, b.supplier_name 
FROM d_part a
INNER JOIN (
SELECT * 
FROM dblink_tds($$select famid, dcm, supplier_name 
    FROM database.dbo.supplier_lookup b$$,$$NeuroXP$$,$$sa$$,$$11111$$) AS 
(famid text,dcm text, supplier_name text)
) b
on a.famid = b.famid
and a.dcm = b.dcm
WHERE batchid = &apos;2002&apos;;
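
-- Sketch only: the same kind of call using the five-argument form from the
-- README excerpt above, passing the port explicitly. The server entry, login
-- and table are the placeholder values from the example above; 1433 is the
-- default port per the README.
SELECT *
FROM dblink_tds($$select famid, dcm, supplier_name
    FROM database.dbo.supplier_lookup$$, $$NeuroXP$$, $$sa$$, $$11111$$, 1433) AS
(famid text, dcm text, supplier_name text);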

&lt;/pre&gt;&lt;/em&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Take note that you have to include the AS (...) column definition list, else it’ll spew out errors relating to the wrong record type.&lt;br /&gt;&lt;br /&gt;My freetds version is 0.64, and this is what I’ve placed in freetds.conf as the definition of the connection:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;em&gt;&lt;pre&gt;
[NeuroXP]
        host = 172.16.124.128
        port = 1433
        tds version = 8.0

&lt;/code&gt;&lt;/em&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Thanks to davidfetter in #postgresql again</description>
  <comments>http://lotso.livejournal.com/103128.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/102867.html</guid>
  <pubDate>Sat, 27 Oct 2007 06:26:54 GMT</pubDate>
  <title>SQL - Error!! Take Evasive Action (and not the other way around)</title>
  <link>http://lotso.livejournal.com/102867.html</link>
  <description>The more I look into the previous d_refresh(tblname text) function, the more I see that the &lt;a href=&quot;http://lotso.livejournal.com/102351.html&quot;&gt;checks for duplicates&lt;/a&gt; are taking up a whole chunk of server/cpu/disk time. (Most important is disk IO time. Even though the time spent there is just seconds, that is seconds too long to check for duplicates IF there aren’t any to begin with!)&lt;br /&gt;&lt;br /&gt;So, I set out to make the function better. &lt;br /&gt;&lt;br /&gt;In this case, I chose NOT to follow the adage “fix it &lt;i&gt;&lt;u&gt;&lt;b&gt;before&lt;/b&gt;&lt;/u&gt;&lt;/i&gt; it breaks”. This time around, I chose to &lt;b&gt;“Fix it IF and ONLY if it breaks”&lt;/b&gt;. This will be a much needed relief for the server.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So, I broke down the previous function and included an EXCEPTION check for unique_violation. (I was looking through the &lt;a href=&quot;http://www.postgresql.org/docs/8.2/interactive/errcodes-appendix.html&quot; rel=&quot;nofollow&quot;&gt;Postgres Docs&lt;/a&gt; for error codes for duplicate primary key issues but couldn’t find any; at that time, I didn’t know that unique_violation == duplicate primary key until I saw this from &lt;a href=&quot;http://www.varlena.com/GeneralBits/106.php&quot; rel=&quot;nofollow&quot;&gt;Varlena.com&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;Hence, currently the function is broken down into 2 different segments. &lt;br /&gt;&lt;br /&gt;1. Normal Insertion, (delete/insert)&lt;br /&gt;2. Delete duplicate Primary key. (This is different from #1 in ways which I can’t explain properly in writing w/o much effort. So, just trust me, it’s different.) &lt;br /&gt;&lt;br /&gt;&lt;u&gt;Function 1.&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;em&gt;&lt;code&gt;
CREATE OR REPLACE FUNCTION d_refresh(tblname text)
  RETURNS void AS
$BODY$

DECLARE
last_r timestamp;
r_interval interval;
del_qry text;
ins_qry text;
del_stime timestamp;
del_etime timestamp;
ins_stime timestamp;
ins_etime timestamp;
max_time timestamp;
del_rows integer; 
ins_rows integer; 

tblname text;
del_job_tblname text := job_tblname || &apos;_delete&apos;;


BEGIN
  SELECT last_refreshed, refresh_interval, sql_delete, sql_insert 
  INTO last_r, r_interval, del_qry, ins_qry
  FROM d_log 
  WHERE table_name = tblname;

select last_sync 
into max_time
from sync_log where table_name = tblname;

IF (last_r+r_interval) &amp;lt; max_time THEN

ins_qry := replace(ins_qry,&apos;fromdate&apos;,quote_literal(last_r));
ins_qry := replace(ins_qry,&apos;todate&apos;,quote_literal(last_r+r_interval));

del_qry := replace(del_qry,&apos;fromdate&apos;,quote_literal(last_r));
del_qry := replace(del_qry,&apos;todate&apos;,quote_literal(last_r+r_interval));

    del_stime := timeofday();
    execute del_qry;
    del_etime := timeofday();

    GET DIAGNOSTICS del_rows = ROW_COUNT;


    ins_stime := timeofday();
    BEGIN
      execute ins_qry;
    EXCEPTION WHEN UNIQUE_VIOLATION THEN
      -- move the duplicates out of the way, then retry the insert
      PERFORM d_refresh_delete(del_job_tblname);
      execute ins_qry;
    END;
    ins_etime := timeofday();

    GET DIAGNOSTICS ins_rows = ROW_COUNT;

  UPDATE d_log 
  SET last_refreshed = last_r + r_interval,
    record_update_date_time =  now(),
    delete_time = del_etime - del_stime,
    insert_time = ins_etime - ins_stime,
    rows_deleted = del_rows,
    rows_inserted = ins_rows
    WHERE job_table_name = job_tblname;
END IF;

RETURN;
END;
$BODY$
  LANGUAGE &apos;plpgsql&apos; VOLATILE;
&lt;/pre&gt;&lt;/em&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;u&gt;Function 2&lt;/u&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;em&gt;&lt;code&gt;
CREATE OR REPLACE FUNCTION d_refresh_delete(tblname text)
  RETURNS void AS
$BODY$

DECLARE
last_r timestamp;
r_interval interval;
del_qry text;
ins_qry text;
del_stime timestamp;
del_etime timestamp;
ins_stime timestamp;
ins_etime timestamp;
max_time timestamp;
del_rows integer; 
ins_rows integer; 

tblname text;
base_tblname text := replace(job_tblname,&apos;_delete&apos;,&apos;&apos;);

BEGIN
   
   SELECT table_name, sql_delete, sql_insert 
  INTO tblname, del_qry, ins_qry
  FROM d_log 
  WHERE job_table_name = job_tblname;

  SELECT last_refreshed, refresh_interval
  INTO last_r, r_interval
  FROM d_log 
  WHERE job_table_name = base_tblname;


    ins_qry := replace(ins_qry,&apos;fromdate&apos;,quote_literal(last_r));
    ins_qry := replace(ins_qry,&apos;todate&apos;,quote_literal(last_r+r_interval));

    del_qry := replace(del_qry,&apos;fromdate&apos;,quote_literal(last_r));
    del_qry := replace(del_qry,&apos;todate&apos;,quote_literal(last_r+r_interval));

    del_stime := timeofday();
    execute del_qry;
    del_etime := timeofday();
    GET DIAGNOSTICS del_rows = ROW_COUNT;

    ins_stime := timeofday();
    execute ins_qry;
    ins_etime := timeofday();
    GET DIAGNOSTICS ins_rows = ROW_COUNT;

    UPDATE d_log 
    SET last_refreshed = last_r + r_interval,
    record_update_date_time =  now(),
    delete_time = del_etime - del_stime,
    insert_time = ins_etime - ins_stime,
    rows_deleted = del_rows,
    rows_inserted = ins_rows
    WHERE job_table_name = job_tblname;

RETURN;
END;
$BODY$
  LANGUAGE &apos;plpgsql&apos; VOLATILE;
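
-- Usage sketch (assuming a d_log row exists for the job; &apos;table_name&apos; is a
-- placeholder):
SELECT d_refresh(&apos;table_name&apos;);
-- On a duplicate key, d_refresh itself ends up running the equivalent of:
--   SELECT d_refresh_delete(&apos;table_name_delete&apos;);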
&lt;/code&gt;&lt;/em&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So.. when I call d_refresh(&apos;table_name&apos;)&lt;br /&gt;and a duplicate primary key condition occurs, it will call d_refresh_delete(&apos;table_name_delete&apos;), i.e. with “_delete” appended to the table name.&lt;br /&gt;&lt;br /&gt;The d_refresh_delete function first strips the “_delete” from the input table name and uses that to obtain the last_refreshed and refresh_interval of the base table, so that it works on the same time/date/interval where the error occurred, and then moves the duplicates into a separate duplicates table for further review.&lt;br /&gt;&lt;br /&gt;Once finished, it returns control to the originating function, which retries the insert and continues with the normal flow.&lt;br /&gt;&lt;br /&gt;I’m pretty happy with this as it greatly reduces the amount of disk IO and server time. Don’t do useless stuff, right?&lt;br /&gt;&lt;br /&gt;Plenty Cool.</description>
  <comments>http://lotso.livejournal.com/102867.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/102440.html</guid>
  <pubDate>Sat, 20 Oct 2007 15:14:59 GMT</pubDate>
  <title>SQL - My plpgSQL Function.. checking for maxtime</title>
  <link>http://lotso.livejournal.com/102440.html</link>
  <description>This is just an update on the &lt;a href=&quot;http://lotso.livejournal.com/102114.html&quot;&gt;function d_refresh(tblname text) &lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I wanted to ensure that the function will not run when the last sync time of the DB to the master DB is less than the last_refreshed time + refresh_interval time. So, I used an IF to wrap it up.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;code&gt;&lt;pre&gt;
CREATE OR REPLACE FUNCTION d_refresh(tblname text)
  RETURNS void AS
$BODY$

DECLARE
last_r timestamp;
r_interval interval;
del_qry text;
ins_qry text;
del_stime timestamp;
del_etime timestamp;
ins_stime timestamp;
ins_etime timestamp;
max_time timestamp;

BEGIN
  SELECT last_refreshed, refresh_interval, sql_delete, sql_insert 
  INTO last_r, r_interval, del_qry, ins_qry
  FROM d_log 
  WHERE table_name = tblname;

select last_sync 
into max_time
from sync_log where table_name = tblname;

IF (last_r+r_interval) &amp;lt; max_time THEN

ins_qry := replace(ins_qry,&apos;fromdate&apos;,quote_literal(last_r));
ins_qry := replace(ins_qry,&apos;todate&apos;,quote_literal(last_r+r_interval));

del_qry := replace(del_qry,&apos;fromdate&apos;,quote_literal(last_r));
del_qry := replace(del_qry,&apos;todate&apos;,quote_literal(last_r+r_interval));

  del_stime := timeofday();
  execute del_qry;
  del_etime := timeofday();

  ins_stime := timeofday();
  execute ins_qry;
  ins_etime := timeofday();


  UPDATE d_log 
  SET last_refreshed = last_r + r_interval,
  record_update_date_time =  now(),
  delete_time = del_etime - del_stime,
  insert_time = ins_etime - ins_stime
  WHERE table_name = tblname;
END IF;

RETURN;
END;
$BODY$
  LANGUAGE &apos;plpgsql&apos; VOLATILE;
&lt;/em&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Cool...</description>
  <comments>http://lotso.livejournal.com/102440.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/102351.html</guid>
  <pubDate>Sat, 20 Oct 2007 14:42:04 GMT</pubDate>
  <title>SQL - Optimising Delete</title>
  <link>http://lotso.livejournal.com/102351.html</link>
  <description>I admit, my sql foo still has to be honed.&lt;br /&gt;&lt;br /&gt;My &lt;s&gt;previous&lt;/s&gt; problem was that the profiling from the previous entry showed that I was spending far too much time on the delete portion of the query as opposed to the insert.&lt;br /&gt;&lt;br /&gt;(What I was basically trying to do was just a simple DELETE then INSERT.)&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;code&gt;&lt;pre&gt;
     table_name      |   delete_time   |   insert_time
---------------------+-----------------+-----------------
 z                   | 00:01:14.424943 | 00:00:02.622862
&lt;/pre&gt;&lt;/code&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;The delete query was this (shown here as a SELECT with the same condition)..&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;code&gt;&lt;pre&gt;
select * from d_trz where exists(
select 1
from z
where z.record_update_date_time &amp;gt;= &apos;2007-08-08 18:00:00&apos;
and z.record_update_date_time   &amp;lt;  &apos;2007-08-08 18:01:00&apos;
and d_trz.id= z.id 
and d_trz.hid = z.hid 
and d_trz.start_date_time = z.start_date_time 
and d_trz.type = z.type 
and d_trz.phase_id = z.phase_id
)

the explain was giving me this:

Seq Scan on d_trz  (cost=0.00..414862.31 rows=21114 width=1611)
  Filter: (subplan)
  SubPlan
    -&amp;gt;  Index Scan using idx_trz_uptime on z  (cost=0.00..9.71 rows=1 width=0)
          Index Cond: ((record_update_date_time &amp;gt;= &apos;2007-08-08 18:00:00&apos;::timestamp without time zone) AND (record_update_date_time &amp;lt; &apos;2007-08-08 18:01:00&apos;::timestamp without time zone))
          Filter: ((($0)::text = (id)::text) AND ($1 = hid) AND ($2 = start_date_time) AND ($3 = type) AND ($4 = phase_id))
&lt;/pre&gt;&lt;/code&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Performance really sucked as the number of rows in that table increased.&lt;br /&gt;&lt;br /&gt;So, I went googling, but turned up nothing which would help me optimise the query. So.. off I went to IRC and asked the question.&lt;br /&gt;...&lt;br /&gt;...&lt;br /&gt;...&lt;br /&gt;&amp;lt;later&amp;gt;&lt;br /&gt;&lt;br /&gt;I got an answer (again) from &lt;a href=&quot;http://a-kretschmer.de/&quot; rel=&quot;nofollow&quot;&gt;akretschmer&lt;/a&gt; to try the query in an alternate way..&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;code&gt;&lt;pre&gt;

select * from d_trz where 
(id, hid, start_date_time, type, phase_id) in 
(select id, hid, start_date_time, type, phase_id from z
where z.record_update_date_time &amp;gt;= &apos;2007-08-08 18:00:00&apos;
and z.record_update_date_time   &amp;lt;  &apos;2007-08-08 18:01:00&apos;) 

Nested Loop  (cost=9.71..18.05 rows=1 width=1611) (actual time=66.683..70.852 rows=82 loops=1)
  -&amp;gt;  HashAggregate  (cost=9.71..9.72 rows=1 width=30) (actual time=66.634..67.047 rows=254 loops=1)
        -&amp;gt;  Index Scan using idx_trz_uptime on z  (cost=0.00..9.70 rows=1 width=30) (actual time=0.107..9.729 rows=5170 loops=1)
              Index Cond: ((record_update_date_time &amp;gt;= &apos;2007-08-08 18:00:00&apos;::timestamp without time zone) AND (record_update_date_time &amp;lt; &apos;2007-08-08 18:01:00&apos;::timestamp without time zone))
  -&amp;gt;  Index Scan using d_trz_pkey on d_trz  (cost=0.00..8.30 rows=1 width=1611) (actual time=0.009..0.010 rows=0 loops=254)
        Index Cond: (((d_trz.id)::text = (z.id)::text) AND (d_trz.hid = z.hid) AND (d_trz.start_date_time = z.start_date_time) AND (d_trz.type = z.type) AND (d_trz.phase_id = z.phase_id))
Total runtime: 71.182 ms
&lt;/em&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Many thanks again to &lt;a href=&quot;http://a-kretschmer.de/&quot; rel=&quot;nofollow&quot;&gt;akretschmer&lt;/a&gt;.</description>
  <comments>http://lotso.livejournal.com/102351.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>1</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/102114.html</guid>
  <pubDate>Sat, 20 Oct 2007 09:25:55 GMT</pubDate>
  <title>SQL - Profiling a SQL function</title>
  <link>http://lotso.livejournal.com/102114.html</link>
  <description>It’s not really rocket science, but I wanted to know what was taking so long in my function. I suspected the delete, because the explain plan keeps doing a sequential scan as opposed to an index scan (then again, the table is still very small; it’s only 12K rows for now).&lt;br /&gt;&lt;br /&gt;So.. I first proceeded to put some “timers” into the function. However, since I didn’t know better, I used current_timestamp and now().&lt;br /&gt;&lt;br /&gt;The function looks something like this :-&lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;code&gt;&lt;pre&gt;
CREATE OR REPLACE FUNCTION d_refresh(tblname text)
  RETURNS void AS
$BODY$

DECLARE
last_r timestamp;
r_interval interval;
del_qry text;
ins_qry text;
del_stime timestamp;
del_etime timestamp;
ins_stime timestamp;
ins_etime timestamp;

BEGIN
  SELECT last_refreshed, refresh_interval, sql_delete, sql_insert 
  INTO last_r, r_interval, del_qry, ins_qry
  FROM d_log 
  WHERE table_name = tblname;

  ins_qry := replace(ins_qry, &apos;fromdate&apos;, quote_literal(last_r));
  ins_qry := replace(ins_qry, &apos;todate&apos;, quote_literal(last_r + r_interval));

  del_qry := replace(del_qry, &apos;fromdate&apos;, quote_literal(last_r));
  del_qry := replace(del_qry, &apos;todate&apos;, quote_literal(last_r + r_interval));

  del_stime := now();
  execute del_qry;
  del_etime := now();

  ins_stime := now();
  execute ins_qry;
  ins_etime := now();


  UPDATE d_log 
  SET last_refreshed = last_r + r_interval,
  record_update_date_time =  now(),
  delete_time = del_etime - del_stime,
  insert_time = ins_etime - ins_stime
  WHERE table_name = tblname;

RETURN;
END;
$BODY$
  LANGUAGE &apos;plpgsql&apos; VOLATILE;
&lt;/em&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But I then found out that the recorded time was always 0. That most certainly can’t be right, since the queries (insert/delete) take roughly 5 to 19 secs.&lt;br /&gt;&lt;br /&gt;I pinged some guys on IRC #postgres and half an hour later I found my problem. Note that I used now() above. According to &lt;a href=&quot;http://a-kretschmer.de/&quot; rel=&quot;nofollow&quot;&gt;akretschmer&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
[akretschmer] now() and current_timestamp returns the start-timestamp of the current transaction
[akretschmer] timeofday() returns the exact time
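&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Since the whole function runs inside one transaction, every now() call returned the same value and the differences came out as 0. So.. with that, the above becomes &lt;br /&gt;&lt;br /&gt;&lt;em&gt;&lt;code&gt;&lt;pre&gt;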
CREATE OR REPLACE FUNCTION d_refresh(tblname text)
  RETURNS void AS
$BODY$

DECLARE
last_r timestamp;
r_interval interval;
del_qry text;
ins_qry text;
del_stime timestamp;
del_etime timestamp;
ins_stime timestamp;
ins_etime timestamp;

BEGIN
  SELECT last_refreshed, refresh_interval, sql_delete, sql_insert 
  INTO last_r, r_interval, del_qry, ins_qry
  FROM d_log 
  WHERE table_name = tblname;

  ins_qry := replace(ins_qry, &apos;fromdate&apos;, quote_literal(last_r));
  ins_qry := replace(ins_qry, &apos;todate&apos;, quote_literal(last_r + r_interval));

  del_qry := replace(del_qry, &apos;fromdate&apos;, quote_literal(last_r));
  del_qry := replace(del_qry, &apos;todate&apos;, quote_literal(last_r + r_interval));

  del_stime := timeofday();
  execute del_qry;
  del_etime := timeofday();

  ins_stime := timeofday();
  execute ins_qry;
  ins_etime := timeofday();


  UPDATE d_log 
  SET last_refreshed = last_r + r_interval,
  record_update_date_time =  now(),
  delete_time = del_etime - del_stime,
  insert_time = ins_etime - ins_stime
  WHERE table_name = tblname;

RETURN;
END;
$BODY$
  LANGUAGE &apos;plpgsql&apos; VOLATILE;
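
-- A hedged aside, not part of the function above: a quick way to see the
-- difference akretschmer describes. Inside one transaction now() stays fixed
-- while timeofday() keeps moving. clock_timestamp() and pg_sleep() are
-- assumed to be available (PostgreSQL 8.2 or later); clock_timestamp()
-- behaves like timeofday() but returns a timestamp instead of text.
BEGIN;
SELECT now() AS txn_start, timeofday() AS wall_clock_text, clock_timestamp() AS wall_clock;
SELECT pg_sleep(2);
-- txn_start is unchanged here; the two wall-clock columns have moved on
SELECT now() AS txn_start, timeofday() AS wall_clock_text, clock_timestamp() AS wall_clock;
COMMIT;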
&lt;/em&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;and I get &lt;br /&gt;&lt;code&gt;&lt;pre&gt;&lt;em&gt;

MyDB=&amp;gt; select table_name, delete_time, insert_time from denorm_log where 
table_name = &apos;z&apos;;
 table_name |   delete_time   |   insert_time
------------+-----------------+-----------------
 z          | 00:00:02.105404 | 00:00:08.312243
&lt;/em&gt;&lt;/code&gt;&lt;/pre&gt;</description>
  <comments>http://lotso.livejournal.com/102114.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/101633.html</guid>
  <pubDate>Sun, 07 Oct 2007 13:20:40 GMT</pubDate>
  <title>string_to_array and getting a function to return table column_names</title>
  <link>http://lotso.livejournal.com/101633.html</link>
  <description>It’s true, I’m slow when it comes to writing plpgsql functions for PG. I blame it on a lack of ample documentation. Seriously, PG’s plpgsql documentation is really sparse. I know there’s the official documentation and all, but to me it’s not nearly as useful as it could be because it lacks examples, and that sucks in a truly big way.&lt;br /&gt;&lt;br /&gt;The only book I can find on PG is the one I have, a 1st edition of PostgreSQL by Korry Douglas and Susan Douglas. And even that has perhaps a little over 30 pages on plpgsql.&lt;br /&gt;&lt;br /&gt;In the Microsoft, MySQL and Oracle worlds, there are tons of books on their stored procedures. I myself have a few of those in my cupboard.&lt;br /&gt;&lt;br /&gt;Anyway, I digress. The point of this post is to show how to get all the column_names of a particular table returned as a single string. My perl-dbi functions can then use that string to pull from mssql-&amp;gt;pgsql automatically, so that I don’t have to re-write the query whenever additional columns are added to the tables. This will make maintenance easier.&lt;br /&gt;&lt;br /&gt;There are 2 ways of doing this. I went with the more complicated way 1st (mainly because I didn’t know there was a simpler way) and I was banging my head against the wall for the better part of the day. (oh.. I did take an afternoon nap for 3 hours when my head hurt enough.)&lt;br /&gt;&lt;br /&gt;Here’s the 1st one.&lt;br /&gt;&lt;pre&gt;&lt;em&gt;&lt;code&gt;

CREATE OR REPLACE FUNCTION select_columns(tablename text) RETURNS text as $$
DECLARE
   sql_str text;
   qry text;
BEGIN
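   -- loop over every user column of myschema.tablename
   for sql_str in
      select attname from pg_class
      join pg_attribute
      on pg_class.oid = pg_attribute.attrelid
      join pg_namespace
      on pg_namespace.oid = pg_class.relnamespace
      where relname = tablename
      and nspname = &apos;myschema&apos;
      and attnum &amp;gt; 0
   LOOP
      -- qry starts out as NULL, so the first pass adds no leading comma
      qry := coalesce(qry || &apos;,&apos;, &apos;&apos;) || sql_str;
   END LOOP;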

   RETURN qry;
END;
$$
LANGUAGE plpgsql;
&lt;/pre&gt;&lt;/em&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The line&lt;br /&gt;&lt;pre&gt;&lt;em&gt;&lt;code&gt;
qry := coalesce(qry || &apos;,&apos;, &apos;&apos;) || sql_str;
&lt;/pre&gt;&lt;/em&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;is cool because it automatically leaves out the comma before the 1st column_name. In the original declaration qry is NULL, so qry || &apos;,&apos; is NULL as well and coalesce falls back to the empty string.&lt;br /&gt;&lt;br /&gt;This trick wouldn’t work if we had originally declared qry as &lt;br /&gt;&lt;pre&gt;&lt;em&gt;&lt;code&gt;
qry text := &apos;&apos;;
&lt;/pre&gt;&lt;/em&gt;&lt;/code&gt;&lt;br /&gt;which makes qry not NULL, so the result would start with a comma.&lt;br /&gt;&lt;br /&gt;The 2nd method is definitely simpler and easier to use and understand.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;em&gt;&lt;code&gt;
SELECT array_to_string(array(SELECT attname FROM pg_class
JOIN pg_attribute
ON pg_class.oid = pg_attribute.attrelid
JOIN pg_namespace
ON pg_namespace.oid = pg_class.relnamespace
WHERE relname = tablename
AND nspname = &apos;myschema&apos;
AND attnum &amp;gt; 0), &apos;,&apos;);
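
-- A hedged sketch, not from the original post: the query above only runs as-is
-- when tablename is bound to something, so here it is wrapped in a plain SQL
-- function with the table name as parameter $1. The function name
-- select_columns_sql is made up for illustration; myschema is the same
-- placeholder schema as above. The attisdropped and ORDER BY lines are
-- additions, meant to skip dropped columns and keep the table&apos;s column order.
CREATE OR REPLACE FUNCTION select_columns_sql(text) RETURNS text AS $$
    SELECT array_to_string(array(
        SELECT attname::text FROM pg_class
        JOIN pg_attribute
        ON pg_class.oid = pg_attribute.attrelid
        JOIN pg_namespace
        ON pg_namespace.oid = pg_class.relnamespace
        WHERE relname = $1
        AND nspname = &apos;myschema&apos;
        AND attnum &amp;gt; 0
        AND NOT attisdropped
        ORDER BY attnum), &apos;,&apos;);
$$ LANGUAGE sql STABLE;

-- usage (some_table is a hypothetical table in myschema):
-- SELECT select_columns_sql(&apos;some_table&apos;);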
&lt;/pre&gt;&lt;/em&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;I’ll be using the 2nd incarnation as my choice of query string.&lt;br /&gt;&lt;br /&gt;Thanks to depesz in #Postgresql IRC for the pointers.</description>
  <comments>http://lotso.livejournal.com/101633.html</comments>
  <category>linux</category>
  <category>sql</category>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/101583.html</guid>
  <pubDate>Sat, 06 Oct 2007 16:37:40 GMT</pubDate>
  <title>Extending LVM...</title>
  <link>http://lotso.livejournal.com/101583.html</link>
  <description>mrpotato ~ # pvcreate -t -v /dev/sdb1&lt;br /&gt;  Test mode: Metadata will NOT be updated.&lt;br /&gt;    Set up physical volume for “/dev/sdb1” with 35551398 available sectors&lt;br /&gt;    Zeroing start of device /dev/sdb1&lt;br /&gt;  Physical volume “/dev/sdb1” successfully created&lt;br /&gt;    Test mode: Wiping internal cache&lt;br /&gt;    Wiping internal VG cache&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;mrpotato ~ # pvscan&lt;br /&gt;  PV /dev/sda1   VG storage   lvm2 [4.88 GB / 0    free]&lt;br /&gt;  PV /dev/sda4   VG storage   lvm2 [4.26 GB / 0    free]&lt;br /&gt;  PV /dev/sdc1   VG storage   lvm2 [16.95 GB / 32.00 MB free]&lt;br /&gt;  PV /dev/sdb1                lvm2 [16.95 GB]&lt;br /&gt;  Total: 4 [43.03 GB] / in use: 3 [26.08 GB] / in no VG: 1 [16.95 GB]&lt;br /&gt;&lt;br /&gt;mrpotato ~ # pvdisplay&lt;br /&gt;  --- Physical volume ---&lt;br /&gt;  PV Name               /dev/sda1&lt;br /&gt;  VG Name               storage&lt;br /&gt;  PV Size               4.88 GB / not usable 0&lt;br /&gt;  Allocatable           yes (but full)&lt;br /&gt;  PE Size (KByte)       4096&lt;br /&gt;  Total PE              1249&lt;br /&gt;  Free PE               0&lt;br /&gt;  Allocated PE          1249&lt;br /&gt;  PV UUID               8sI5z7-aaYp-mj1l-6q84-RJ04-6Xzq-1EydOK&lt;br /&gt;&lt;br /&gt;  --- Physical volume ---&lt;br /&gt;  PV Name               /dev/sda4&lt;br /&gt;  VG Name               storage&lt;br /&gt;  PV Size               4.26 GB / not usable 0&lt;br /&gt;  Allocatable           yes (but full)&lt;br /&gt;  PE Size (KByte)       4096&lt;br /&gt;  Total PE              1090&lt;br /&gt;  Free PE               0&lt;br /&gt;  Allocated PE          1090&lt;br /&gt;  PV UUID               m2pnY2-GQeg-Ajmo-NQ3W-6gpp-B41K-XPYIMi&lt;br /&gt;&lt;br /&gt;  --- Physical volume ---&lt;br /&gt;  PV Name               /dev/sdc1&lt;br /&gt;  VG Name               storage&lt;br /&gt;  PV Size               16.95 GB / not usable 0&lt;br /&gt;  
Allocatable           yes&lt;br /&gt;  PE Size (KByte)       4096&lt;br /&gt;  Total PE              4338&lt;br /&gt;  Free PE               8&lt;br /&gt;  Allocated PE          4330&lt;br /&gt;  PV UUID               kuWduP-PURj-u7mZ-e2ws-Cq6F-S2tc-aEGk6e&lt;br /&gt;&lt;br /&gt;  --- NEW Physical volume ---&lt;br /&gt;  PV Name               /dev/sdb1&lt;br /&gt;  VG Name&lt;br /&gt;  PV Size               16.95 GB&lt;br /&gt;  Allocatable           NO&lt;br /&gt;  PE Size (KByte)       0&lt;br /&gt;  Total PE              0&lt;br /&gt;  Free PE               0&lt;br /&gt;  Allocated PE          0&lt;br /&gt;  PV UUID               X96Ljl-J4Sv-GTSA-bznp-rQ8c-5Fj7-oUwu53&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;mrpotato ~ # vgdisplay&lt;br /&gt;  --- Volume group ---&lt;br /&gt;  VG Name               storage&lt;br /&gt;  System ID&lt;br /&gt;  Format                lvm2&lt;br /&gt;  Metadata Areas        3&lt;br /&gt;  Metadata Sequence No  14&lt;br /&gt;  VG Access             read/write&lt;br /&gt;  VG Status             resizable&lt;br /&gt;  MAX LV                0&lt;br /&gt;  Cur LV                1&lt;br /&gt;  Open LV               1&lt;br /&gt;  Max PV                0&lt;br /&gt;  Cur PV                3&lt;br /&gt;  Act PV                3&lt;br /&gt;  VG Size               26.08 GB&lt;br /&gt;  PE Size               4.00 MB&lt;br /&gt;  Total PE              6677&lt;br /&gt;  Alloc PE / Size       6669 / 26.05 GB&lt;br /&gt;  Free  PE / Size       8 / 32.00 MB&lt;br /&gt;  VG UUID               iMVDA3-T9m7-oiTy-6aR4-P82G-jGrd-rfJC5S&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;mrpotato ~ # vgextend -t -v storage /dev/sdb1&lt;br /&gt;  Test mode: Metadata will NOT be updated.&lt;br /&gt;    Checking for volume group “storage”&lt;br /&gt;    Test mode: Skipping archiving of volume group.&lt;br /&gt;    Adding physical volume ‘/dev/sdb1’ to volume group ‘storage’&lt;br /&gt;    Wiping cache of LVM-capable devices&lt;br /&gt;    Volume group “storage” will be extended by 
1 new physical volumes&lt;br /&gt;    Test mode: Skipping volume group backup.&lt;br /&gt;  Volume group “storage” successfully extended&lt;br /&gt;    Test mode: Wiping internal cache&lt;br /&gt;    Wiping internal VG cache&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;mrpotato ~ # pvscan&lt;br /&gt;  PV /dev/sda1   VG storage   lvm2 [4.88 GB / 0    free]&lt;br /&gt;  PV /dev/sda4   VG storage   lvm2 [4.26 GB / 0    free]&lt;br /&gt;  PV /dev/sdc1   VG storage   lvm2 [16.95 GB / 32.00 MB free]&lt;br /&gt;  PV /dev/sdb1   VG storage   lvm2 [16.95 GB / 16.95 GB free]&lt;br /&gt;  Total: 4 [43.03 GB] / in use: 4 [43.03 GB] / in no VG: 0 [0   ]&lt;br /&gt;mrpotato ~ # vgdisplay&lt;br /&gt;  --- Volume group ---&lt;br /&gt;  VG Name               storage&lt;br /&gt;  System ID&lt;br /&gt;  Format                lvm2&lt;br /&gt;  Metadata Areas        4&lt;br /&gt;  Metadata Sequence No  15&lt;br /&gt;  VG Access             read/write&lt;br /&gt;  VG Status             resizable&lt;br /&gt;  MAX LV                0&lt;br /&gt;  Cur LV                1&lt;br /&gt;  Open LV               1&lt;br /&gt;  Max PV                0&lt;br /&gt;  Cur PV                4&lt;br /&gt;  Act PV                4&lt;br /&gt;  VG Size               43.03 GB&lt;br /&gt;  PE Size               4.00 MB&lt;br /&gt;  Total PE              11016 &amp;lt;----&lt;br /&gt;  Alloc PE / Size       6669 / 26.05 GB&lt;br /&gt;  Free  PE / Size       4347 / 16.98 GB&lt;br /&gt;  VG UUID               iMVDA3-T9m7-oiTy-6aR4-P82G-jGrd-rfJC5S&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;mrpotato ~ # lvextend -l 11016 -t -v /dev/storage/storage&lt;br /&gt;  Test mode: Metadata will NOT be updated.&lt;br /&gt;    Finding volume group storage&lt;br /&gt;    Test mode: Skipping archiving of volume group.&lt;br /&gt;  Extending logical volume storage to 43.03 GB&lt;br /&gt;    Test mode: Skipping volume group backup.&lt;br /&gt;    Found volume group &quot;storage&quot;&lt;br /&gt;    Found volume group &quot;storage&quot;&lt;br 
/&gt;  Logical volume storage successfully resized&lt;br /&gt;    Test mode: Wiping internal cache&lt;br /&gt;    Wiping internal VG cache&lt;br /&gt;&lt;br /&gt;rpotato ~ # e2fsck -f /dev/storage/storage&lt;br /&gt;e2fsck 1.38 (30-Jun-2005)&lt;br /&gt;Pass 1: Checking inodes, blocks, and sizes&lt;br /&gt;Pass 2: Checking directory structure&lt;br /&gt;Pass 3: Checking directory connectivity&lt;br /&gt;Pass 4: Checking reference counts&lt;br /&gt;Pass 5: Checking group summary information&lt;br /&gt;/dev/storage/storage: 1167/3384128 files (14.1% non-contiguous), 2674943/6829056 blocks&lt;br /&gt;mrpotato ~ # resize2fs /dev/storage/storage&lt;br /&gt;resize2fs 1.38 (30-Jun-2005)&lt;br /&gt;Resizing the filesystem on /dev/storage/storage to 11280384 (4k) blocks.&lt;br /&gt;The filesystem on /dev/storage/storage is now 11280384 blocks long.&lt;br /&gt;&lt;br /&gt;&lt;a href=&apos;http://kbase.redhat.com/faq/FAQ_85_4842.shtm&apos; rel=&apos;nofollow&apos;&gt;http://kbase.redhat.com/faq/FAQ_85_4842.shtm&lt;/a&gt;&lt;br /&gt;&lt;a href=&apos;http://tldp.org/HOWTO/LVM-HOWTO/extendlv.htmlp&apos; rel=&apos;nofollow&apos;&gt;http://tldp.org/HOWTO/LVM-HOWTO/extendlv.htmlp&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;mrpotato ~ # emerge -av e2fsprogs&lt;br /&gt;&lt;br /&gt;These are the packages that would be merged, in order:&lt;br /&gt;&lt;br /&gt;Calculating dependencies... done!&lt;br /&gt;[ebuild     U ] sys-libs/com_err-1.40.2 [1.38] USE=&quot;nls&quot; 3,873 kB&lt;br /&gt;[ebuild     U ] sys-libs/ss-1.40.2 [1.38] USE=&quot;nls&quot; 0 kB&lt;br /&gt;[ebuild     U ] sys-fs/e2fsprogs-1.40.2 [1.38-r1] USE=&quot;nls -static&quot; 0 kB&lt;br /&gt;&lt;br /&gt;Total: 3 packages (3 upgrades), Size of downloads: 3,873 kB&lt;br /&gt;&lt;br /&gt;Would you like to merge these packages? 
[Yes/No] n&lt;br /&gt;&lt;br /&gt;With e2fsprogs &amp;gt;=1.39-1 new filesystems are created with directory indexing and &lt;br /&gt;on-line resizing enabled by default (see /etc/mke2fs.conf).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;mrpotato ~ # df -h&lt;br /&gt;Filesystem            Size  Used Avail Use% Mounted on&lt;br /&gt;/dev/sda3             6.9G  5.4G  1.5G  79% /&lt;br /&gt;udev                  505M  164K  505M   1% /dev&lt;br /&gt;none                  505M     0  505M   0% /dev/shm&lt;br /&gt;/dev/mapper/storage-storage&lt;br /&gt;                       43G  9.8G   33G  24% /storage</description>
  <comments>http://lotso.livejournal.com/101583.html</comments>
  <category>linux</category>
  <category>gentoo</category>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://lotso.livejournal.com/101134.html</guid>
  <pubDate>Sun, 23 Sep 2007 04:14:18 GMT</pubDate>
  <title>The Mantra of Successful BI</title>
  <link>http://lotso.livejournal.com/101134.html</link>
  <description>These days, my focus is geared towards learning and writing SQL. In the past, I tried to replicate from mssql to mysql, and I wasn’t too successful at it for various reasons.&lt;br /&gt;&lt;br /&gt;However, now I’m learning and playing more and more with postgresql and I’m getting more and more impressed with it as a database. While the setup differs between the company’s mssql server and my pgsql one in terms of the number of columns (I’m using PG more like a datamart as opposed to a data warehouse), I’m also limited to lesser hardware: only a 2G celeron w/ 1G ram to deploy on.&lt;br /&gt;&lt;br /&gt;Anyway, that’s not the point of this post. The point of this post is to present to the larger community what makes a good BI (Business Intelligence) tool. BI is a hot topic these days, what with everyone digging and mining through terabytes of raw or summarised data in search of the golden nugget. (I just read in the papers today that the US Dept of Homeland Security is collecting and mining data on US residents to determine suspicious behaviour.)&lt;br /&gt;&lt;br /&gt;So, what makes a good BI tool? Point and Click? Drag and Drop? (yet another) New interface?&lt;br /&gt;&lt;br /&gt;Let me tell you a story about what makes a good BI tool, something which (nearly) everyone can use w/o much re-learning. It’s a ubiquitous tool, and though I’m largely a power user of it, I’m not recommending it if you can use a FOSS-based alternative instead. There’s not much to learn because of its ubiquity and pervasiveness.&lt;br /&gt;&lt;br /&gt;You get your drag and drop and you get your point and click as well. Its interface is user friendly and it provides familiar surroundings for your users.&lt;br /&gt;&lt;br /&gt;The tool, which I use and which I wrote macros for, is called “EXCEL” (tm). Yep.. 
I (or rather we) use Microsoft(tm) Excel(tm) as a front end to the DB to obtain and to slice &amp; dice the data, and it’s working well.&lt;br /&gt;&lt;br /&gt;Why Excel? I know there are lots of technical and non-technical reasons not to use Excel, but that’s beside the point. When you’re given lemons, you have to make lemonade, right? So, that was the situation handed to me, and I grabbed it by its horns and made it work. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;u&gt;Adapt or die.&lt;/u&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I’ve been reading &lt;a href=&quot;http://andyonenterprisesoftware.com/2007/07/the-price-of-failure/&quot; rel=&quot;nofollow&quot;&gt;Andy&lt;/a&gt; and his opinion, in “The Price Of Failure”, on the comment by Madan Sheina on the failure of BI projects.&lt;br /&gt;&lt;br /&gt;I especially like point #3, and I quote Andy:&lt;br /&gt;&lt;quote&gt;&lt;em&gt;&lt;br /&gt;...&lt;br /&gt;&lt;br /&gt;3. “Just one more new user interface” is not what the customer wants to hear. “Most are familiar with Excel and are not willing to change their business experience” was one quote from a customer in the article. Spot on! Why should a customer whose main job is, after all, not IT but something in the business, have to learn a different tool just to get access to data that he or she needs? Some tool vendors have done a good job of integrating with Excel, and yet are often in denial about this since they view their proprietary interface as a key competitive weapon against other vendors. Customers don’t care about this; they just want to get at the data they need to do their job on an easy and timely way. Hence a BI project should, if at all possible, look at ways of allowing users to getting data into their familiar Excel rather than foisting new interfaces on them. 
A few analyst types will be prepared to learn a new tool, but this is only a small subset of the audience for a BI project, likely 10% or less.&lt;br /&gt;&lt;br /&gt;&lt;/quote&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Like it or not, spreadsheets are here to stay.&lt;br /&gt;A spreadsheet in any form (gnumeric, openoffice calc, excel, koffice), anything which can connect to a DB and retrieve data, is your greatest BI tool.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://searchcrm.techtarget.com/columnItem/0,294698,sid11_gci1081869,00.html&quot; rel=&quot;nofollow&quot;&gt;Rick Sherman, in a 2005 article, wrote&lt;/a&gt; :&lt;br /&gt;&lt;quote&gt;&lt;em&gt;&lt;br /&gt;...&lt;br /&gt;For many years BI vendors have been building front-end tools to try to replace spreadsheets for querying, reporting and analyzing data results. But despite the fact that tens of thousands of BI tool licenses have been sold, spreadsheets are still the most pervasive and dominant tool.&lt;br /&gt;...&lt;br /&gt;&lt;/quote&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;Forrester even went on to publish a whitepaper entitled “Ouch! Get Ready - Spreadsheets are Here to Stay for Business Intelligence”, which can be downloaded &lt;a href=&quot;http://web1.forrester.com/forr/reg/campaignlogin.jsp?lr=/Marketing/Campaign2/1,6538,909,00.html&amp;amp;RegistrationID=1-BI38PD&amp;amp;regmode=marketingtrial&amp;amp;iCampaignID=909&quot; rel=&quot;nofollow&quot;&gt;here for free&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;quote&gt;&lt;em&gt;&lt;br /&gt;...&lt;br /&gt;“Spreadsheets — the most widely used business intelligence (BI) tool — are a permanent fixture in enterprises because no other analytical application outperforms them in flexibility, ease of use, and ubiquity. Spreadsheets’ role in BI is no longer limited to simple import/export mechanisms; they now play an integral role in all layers of the BI stack. Yet the lack of controls and security and integrity issues create tremendous challenges. 
To minimize risks while gaining the inherent BI value of spreadsheets, information and knowledge management professionals must discriminate between the different ways spreadsheets are used. Then, they must help users apply advanced spreadsheet tools and techniques to their daily jobs, while also implementing a tightly controlled (or closely monitored) environment for critical production processes that rely on spreadsheet data. In turn, vendors should take advantage of this market opportunity by introducing tools that will bridge the gap between spreadsheet management and spreadsheet usage in the BI process.”&lt;br /&gt;...&lt;br /&gt;&lt;/quote&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The paper from Information Builders titled “Worst Practices in BI”, which can be downloaded from &lt;a href=&quot;http://www.b-eye-network.com/files/2007%20Information%20Builders%20Worst%20Practices%20in%20BI%20WP.pdf&quot; rel=&quot;nofollow&quot;&gt;here&lt;/a&gt;, has this anecdote by Ralph Kimball:&lt;br /&gt;&lt;br /&gt;&lt;quote&gt;&lt;em&gt;&lt;br /&gt; “The majority of the user base likely will access the data via pre-built parameter-driven analytic applications. Approximately 90 to 95 percent of the potential users will be served by these canned applications that are essentially finished templates that do not require users to construct relational queries directly.”&lt;br /&gt;&lt;/quote&gt;&lt;/em&gt;&lt;br /&gt;&lt;br /&gt;BI tools need to satisfy the needs of the main business users, and they need to provide the data in a timely fashion to gain the most from the “time to market” and to maintain competitiveness. &lt;br /&gt;&lt;br /&gt;However, the use of spreadsheets is also a bane because of “operator error”, as I will put it. One way to ensure that data is not calculated wrongly is to make sure that users/operators need not apply formulas and other manual calculations themselves and can just use the data as is. 
Excel or any other spreadsheet only provides the means (UI) for getting the data from the data mart/warehouse into pivot tables.&lt;br /&gt;&lt;br /&gt;One thing which I’ve yet to work out is how to do what WebFOCUS does (you’ll have to refer to figures 2 &amp; 3 of the report to understand), that is, to also put the excel calculations into the spreadsheet when the data is exported.&lt;br /&gt;&lt;br /&gt;That is a cool feature.&lt;br /&gt;&lt;br /&gt;So.. my friends, what BI tools are being used extensively in your environment? And are your users using them properly? I know that I didn’t use the company’s new BI tool (actually, there was none since they pulled the plug on Business Objects) and asked users to render SQL themselves.</description>
  <comments>http://lotso.livejournal.com/101134.html</comments>
  <category>gentoo</category>
  <category>sql</category>
  <category>rants</category>
  <lj:security>public</lj:security>
  <lj:reply-count>6</lj:reply-count>
</item>
</channel>
</rss>

