Bartram's Bits

Wednesday, April 28, 2010

Thanks Alagad!

Alagad recently had a contest tied to cf.Objective to win one of their fine backpacks, and I have been honored as one of the three winners. Congratulations also go out to Steve Withington and Wil Genovese, winners of the other two backpacks. Details can be found here: Backpack Contest - We Have Winners!

Saturday, December 26, 2009

Creating a Twitter Background Image for all Resolutions

Twitter doesn't give much space in the profile section, but I've seen a few people who have used the background image to provide more details about themselves. I decided to create a new background image for my Twitter account, http://twitter.com/edbartram, in the same vein.

Creating a simple background image was easy: I fired up GIMP and created a new 1600x1200 image. Next, I needed to know where to put the text, so I did some experimentation and came up with a template to handle multiple resolutions.

I placed vertical guides at 10, 110, 240, 895, and 1015 pixels from the left edge. This provided 6 columns:

  • a margin

  • an area for detailed text

  • a column for a vertical url

  • a space where Twitter will display its content

  • another column to repeat the vertical url

  • and a final column that expands off to the sunset

I also placed a horizontal guide at 75 pixels from the top, which split the graphic into 2 rows:

  • one for the Twitter header

  • the other for the content area.


Guides

Because Twitter centers its content in the browser page, it moves around when used with different resolutions. The guide values that I used above allowed me to create a Twitter background that displayed gracefully at the most common resolutions:

  • 800x600 - Twitter is optimized for 800x600 so the background image barely displays more than a few pixels.

  • 1024x768 - The detail column displays to the left of the Twitter content, which conceals the 1st URL column but leaves the 2nd URL column to display on the right.

  • 1280x1024 - The detail column and the 1st URL column display to the left of the Twitter content, which conceals the 2nd URL column.

  • 1440x900 - Looks just like 1280x1024 with a little more padding between the 1st URL and the Twitter content.
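
The resolution notes above can be checked with a little arithmetic. Here is a rough Python sketch that reports which columns sit fully outside a centered content block at each browser width. The 760-pixel content width is an assumption (the post does not state it), the function and column names are made up for illustration, and scrollbars are ignored.

```python
# Guide positions from the post, measured from the left edge of the 1600px-wide image.
GUIDES = [10, 110, 240, 895, 1015]
IMAGE_WIDTH = 1600
COLUMN_NAMES = ["margin", "detail text", "first URL",
                "Twitter content", "second URL", "remainder"]

# Assumption: width of Twitter's centered content block (not given in the post).
ASSUMED_CONTENT_WIDTH = 760


def columns():
    """Return (name, start, end) for each of the six columns between the guides."""
    edges = [0] + GUIDES + [IMAGE_WIDTH]
    return list(zip(COLUMN_NAMES, edges[:-1], edges[1:]))


def fully_visible(viewport_width, content_width=ASSUMED_CONTENT_WIDTH):
    """List background columns that are not covered by the centered content."""
    content_left = (viewport_width - content_width) // 2
    content_right = content_left + content_width
    visible = []
    for name, start, end in columns():
        if name == "Twitter content":
            continue  # this column is meant to sit underneath the content
        left_of_content = end <= content_left
        right_of_content = start >= content_right and end <= viewport_width
        if left_of_content or right_of_content:
            visible.append(name)
    return visible


for width in (800, 1024, 1280, 1440):
    print(width, fully_visible(width))
```

With a different assumed content width the cutoffs shift, so treat the numbers as a starting point for your own experimentation rather than exact values.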

Friday, November 13, 2009

My Notes from Google Search Appliance Seminar

On Thursday, November 12, 2009, I attended a seminar at Google's Chicago office given by Bob Segal of Fig Leaf Software on the Google Search Appliance (GSA). The following are my notes from that two-hour presentation:


  • The GSA returns search results according to the user's security. This means that the GSA can index secured content and only include it in search results if the user performing the search has access rights to the secured content.

  • Websites used as examples in the presentation:
    http://www.heritage.org/
    http://www.muschealth.com/

  • Mozilla Firefox has a plugin called URL Params that Bob used frequently in the demonstration. It presents all variables in the URL scope in an easy-to-view form, so you don't have to edit the query string in the address bar. https://addons.mozilla.org/en-US/firefox/addon/1290
    After installing it, I found that it also supports editing form fields, but there is an extra step required to make the plug-in function as a Sidebar: chrome://urlparams/content/lib/addpanel.xul

  • Search results can be categorized using both collections and meta data. The example shown was a column on the right hand side of the page showing number of results found in each category of the site allowing the user to narrow their search focus.

  • The term Universal Login was used, and my understanding of the definition is that it is "using the same credentials across multiple authentication systems." This means the user still needs to log in to each system, but they are using the same username and password rather than a different scheme for each. Not quite Single Sign-On, but a good workaround for disparate systems.

  • Great resource for learning how to use and support a GSA: http://www.learngsa.com/

  • Status and Reports > Crawl Diagnostics has a URL Status drop-down which can be used to limit the view to 404 Errors, Robot Exclusions, etc.

  • You can view search results in XML format by removing the proxystylesheet parameter. I tested this, and you need to remove the variable completely; just setting it to a blank value yields a 500 server error.

    • You can add meta data to be displayed in the XML by setting &getfields=*

    • When searching you can limit the results to only include searching specific meta data fields:
      inmeta:author=[specific name]
      inmeta:author~[partial name]
      inmeta:title
      inmeta:site


  • Three URL parameters select what you are searching, how it looks, and how the results are modified:

    • client: This is your Front End excluding the Output Format section and controls how your results are modified, such as keywords, filtering, exclusions, etc.

    • proxystylesheet: this is the Output Format section of your Front End and controls the page's design

    • site: this is your collection name


  • Next, Dictionaries were discussed. This is where you can set up synonyms to guide the user to industry specific terms. The example used was the term "port". It means different things to the shipping industry, wine makers, and computer hardware builders. I believe this is the Query Expansion item under the Serving section for GSA 5.2.0.

  • In Serving > Front Ends > Output Format > Page Layout Helper > Global Attributes there's a setting Enable ASR / Enable Advanced Search Reporting. I never researched what this meant, but it sounded like a good thing, so I clicked it. The presenter talked about Self Learning Scoring which is where the GSA learns which links are better results based on which links people click on in search results. This is ASR, Advanced Search Reporting.

  • GSA Unification is for connecting multiple GSAs together and is available in version 6.x for the newer GSA machines, but not the 1001.

  • Search Dates are controlled by document dates. The presenter recommended that we select meta, provide a custom meta field and modify our pages to include that meta field and control what date we want the content to use. The demo looked like there was a dropdown in 6.x, but in 5.2.0, there are two sections on the Serving > Result Biasing config page.

  • The GSA is case sensitive for its crawls (I'm assuming because it is running on a *nix platform), so the same document could be returned several times in the results depending on how it is called. For example, MyDocument.doc and mydocument.doc are two different results. Now that I'm thinking about it more though I'm not sure if this means the document has to exist with two different cased names or if it is simply linked with two differently cased names. I was under the impression that it would be indexed multiple times if there were multiple links with different variations of case.

  • Another useful link was given: http://gigz.com/google.htm
    Bonus: the homepage contains useful links for a variety of other technologies that I'm interested in!

  • And finally, I was excited to hear ColdFusion mentioned in the presentation! w00t!
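
To tie together the URL parameters from the notes above (client, proxystylesheet, site, and getfields), here is a minimal Python sketch that builds a search URL. The host name, the /search path, the q parameter for the search terms, and the front end and collection names are assumptions for illustration only; substitute whatever your own appliance uses.

```python
from urllib.parse import urlencode

# Assumptions: a placeholder host and search path, not taken from the seminar.
BASE_URL = "http://gsa.example.com/search"


def gsa_search_url(query, collection, front_end, styled=True, fields=None):
    """Build a search URL from the parameters described in the notes."""
    params = {
        "q": query,           # search terms (assumed parameter name)
        "site": collection,   # the collection name
        "client": front_end,  # the Front End, minus its Output Format section
    }
    if styled:
        # The Output Format section of the Front End controls the page design.
        params["proxystylesheet"] = front_end
    # When styled is False the parameter is left out entirely, because a
    # blank proxystylesheet value produced a 500 error in my test.
    if fields:
        params["getfields"] = fields  # "*" asks for all meta data in the XML
    return BASE_URL + "?" + urlencode(params)


# HTML results styled by the front end.
html_url = gsa_search_url("coldfusion", "default_collection", "default_frontend")

# Raw XML results with meta data, limited to one author via inmeta.
xml_url = gsa_search_url("inmeta:author=Smith", "default_collection",
                         "default_frontend", styled=False, fields="*")

print(html_url)
print(xml_url)
```

Note that urlencode escapes characters such as the colon and asterisk, which is what a browser does with the query string as well.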



Links from notes:
http://www.heritage.org/ - example of basic search using GSA
http://www.muschealth.com/ - example of a search integrated into the website using GSA
https://addons.mozilla.org/en-US/firefox/addon/1290 - URL Params Firefox Plug-in
chrome://urlparams/content/lib/addpanel.xul - Enables URL Params in Firefox as a Sidebar
http://www.learngsa.com/ - Learning site for the Google Search Appliance
http://gigz.com/google.htm - Collection of useful links for the Google Search Appliance


This training is still available in Dallas at the time of writing this entry. You can register at http://www.figleaf.com/Training/GoogleSeminar.cfm

Wednesday, May 20, 2009

Multiple Instances in ColdFusion 7

I've always been aware that you could create multiple instances in ColdFusion, but I always thought it would be some complicated arcane process. So up until recently, I never felt compelled to attempt it. In reality I found it to be rather easy and have successfully created multiple instances in ColdFusion 7.0 using both Apache 2.0 and IIS 6.0 webservers (Yes, I know, there are newer versions out there, but we can't all be cutting edge). In my research, I noticed that while the documentation available on the internet was sufficient, it was a little sparse. So I hope by sharing my notes that I will help someone else as well.

ColdFusion

Create showInstance.cfm in a new web folder:

<cfobject action="create" type="java" class="jrunx.kernel.JRun" name="jr">
<cfset servername = jr.getServerName()>
<cfoutput>JRun Server Name: #servername#</cfoutput>

Create instance in cfadmin
  • Enterprise Manager > Instance Manager
  • Add New Instance
  • Server Name: whatever you want to call the instance
  • Server Directory: let CFadmin handle this
  • Create From EAR/WAR: leave blank
  • check Create Windows Service
  • check Auto Restart Service
  • Submit
  • Wait...
Edit JRun4/servers/________/SERVER-INF/jrun.xml
  • search for JRunProxyService
  • change deactivated from true to false
  • note port
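
If you would rather script the jrun.xml step above than edit the file by hand, the idea can be sketched in Python. The XML layout in the sample string (a service element holding name/value attribute elements) and the port number are assumptions based on my reading of the file, so compare them against your own jrun.xml before relying on this. Note that this kind of round trip drops comments and the DOCTYPE, so the sketch works on a string instead of rewriting the real file.

```python
import xml.etree.ElementTree as ET

# Assumed, trimmed-down shape of the relevant part of jrun.xml; the port is made up.
SAMPLE_JRUN_XML = """<jrun-xml>
  <service class="jrun.servlet.jrpp.JRunProxyService" name="ProxyService">
    <attribute name="deactivated">true</attribute>
    <attribute name="port">51002</attribute>
  </service>
</jrun-xml>"""


def activate_proxy(xml_text):
    """Set deactivated to false on JRunProxyService and return (xml, port)."""
    root = ET.fromstring(xml_text)
    for service in root.iter("service"):
        if service.get("class", "").endswith("JRunProxyService"):
            port = None
            for attribute in service.findall("attribute"):
                if attribute.get("name") == "deactivated":
                    attribute.text = "false"
                elif attribute.get("name") == "port":
                    port = attribute.text  # note this for the web server setup
            return ET.tostring(root, encoding="unicode"), port
    raise ValueError("JRunProxyService not found")


updated_xml, proxy_port = activate_proxy(SAMPLE_JRUN_XML)
print("Proxy port to use in the web server config:", proxy_port)
```

The port returned here is the value that goes into the JRunConfig Bootstrap line in the Apache section.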

APACHE Web Server

Edit httpd.conf
Add virtual mapping (replace _____ with the port noted from jrun.xml):
JRunConfig Serverstore "C:/JRun4/lib/wsconfig/1/jrunserver.store"
JRunConfig Bootstrap 127.0.0.1:_____
JRunConfig Apialloc false
Web Server Configuration Tool
  • Add
  • JRun Host: localhost
  • JRun Server: select instance created in CFadmin
  • Web Server: select Apache
  • Configuration Directory: browse to where httpd.conf is located
  • check Configure web server for ColdFusion MX applications
  • OK
Apache Service Monitor
  • Stop Apache
  • Start Apache

IIS Web Server

Create web site
  • Run Internet Information Services (IIS) Manager
  • Expand local computer
  • Right click on Web Sites and select New > Web Site
  • Next
  • Description: type name of new website
  • Enter the IP address to use for this Web site: (All Unassigned)
  • TCP port this Web site should use: 80
  • Host header for this Web site: same as recorded in DNS or hosts file
  • Next
  • Path: Browse to the folder you created above
  • Next
  • Next
  • Finish
  • Right click your new site and select New > Virtual Directory
  • Next
  • Alias: CFIDE
  • Path: Browse to wherever your CFIDE folder is, for example: c:\inetpub\wwwroot\CFIDE\
  • Next
  • Next
  • Finish
Web Server Configuration Tool
  • Add
  • JRun Host: localhost
  • JRun Server: select instance created in CFadmin
  • Web Server: select Internet Information Server (IIS)
  • IIS Web Site: select the site you created above
  • check Configure web server for ColdFusion MX applications
  • OK

Test your new instance

Open your browser to the new site and run showInstance.cfm to verify that it is running the new instance.

Tuesday, March 24, 2009

Converting Model-Glue Docs to HTML

I saw Ray Camden's Friday Puzzler - Helping the Model-Glue Team from March 13th today and noticed a solution had not been posted yet. I gave it some thought and had visions of recursive CFCs parsing out each HTML tag using Regular Expressions initially. But then I realized the task wasn't to create a tool to parse the HTML out of any webpage, but to parse the HTML out of these specific webpages.

Digging into the source code of these pages generated by RoboHelp, I began to see a structure and a set of rules that appeared to be followed on each page. I discovered the navigation tree was comprised of 19 webpages each representing a folder with links to the other folders and the documents contained within itself. My initial code looped through each of these files and stripped out anything above <body> and below </body> using the FindNoCase() function. Next, I added code to loop through the resulting HTML looking for links. Again, the code kept to rules, so I was able to parse out the links to folders by looking for target="_self". Any links that didn't refer to "_self" were the documents we were looking to "scrape" the content from. At this point I cheated a little; I now had a list of all the documents and the folder structure that they were stored in, so I manually created the folders mirroring the structure on the website. I figured this was ok, as I was creating a one-time process, not a reusable application. Finally, getting back to coding, I looped through each document stripping out anything above <h1 and below the <script beneath the content.

I apologize that the Blogger template I'm using doesn't handle code very well. You can download the code as well as all the docs in the link below:

Download docs.model-glue.com.zip

<!--- I manually reproduced the folder structure of the Model-Glue docs on my local hard drive --->
<cfset pathRemote = "http://docs.model-glue.com/whgdata/">
<cfset pathLocal = "c:/docs.model-glue.com/whgdata/">

<!--- The navigation tree for the docs in RoboHelp is comprised of 19 html files --->
<cfloop index="ptrTree" from="0" to="18">
  <cfset fileName="whlstt#ptrTree#.htm">
  <cfset fileRemote="#pathRemote##fileName#">
  <cfset fileLocal="#pathLocal##fileName#">

  <!--- Call up each of the 19 html files that make up the navigation tree and loop through finding each document --->
  <cfhttp url="#fileRemote#" method="get" resolveurl="yes" throwonerror="yes"></cfhttp>
  <cfif cfhttp.statusCode is "200 OK">
    <p><strong><cfoutput>#fileName#</cfoutput></strong><br /><cfflush>
    <cfset treeHTML=cfhttp.FileContent>
    <cfset ptrLink=1>

    <!--- Loop through each link in the navigation tree looking for links to documents --->
    <cfloop condition="ptrLink lt len(treeHTML)">
      <cfset startLink=FindNoCase("<a href=",treeHTML,ptrLink)>
      <cfif startLink gt 0>
        <cfset endLink=FindNoCase("</a>",treeHTML,startLink)+3>
        <cfset tmpLink=mid(treeHTML,startLink,endLink-startLink+1)>

        <!--- Found a link to a document, so parse out the url and link title --->
        <cfif Not(FindNoCase("_self",tmpLink))>
          <cfset startURL=FindNoCase("http://",tmpLink)>
          <cfset endURL=FindNoCase(".htm",tmpLink,startURL)>
          <cfset startImg=FindNoCase("<img",tmpLink,endURL)>
          <cfset startTitle=FindNoCase(">",tmpLink,startImg)+1>
          <cfset endTitle=FindNoCase("</a>",tmpLink,startTitle)>
          <cfset pageURL=mid(tmpLink,startURL,endURL-startURL+4)>
          <cfset pageTitle=mid(tmpLink,startTitle,endTitle-startTitle)>
          <cfoutput><a href="#pageURL#">#pageTitle#</a></cfoutput><br /><cfflush>

          <!--- Call up the document and parse out just the HTML throwing out the extra code --->
          <cfhttp url="#pageURL#" method="get" resolveurl="yes" throwonerror="yes"></cfhttp>
          <cfif cfhttp.statusCode is "200 OK">
            <cfset pageHTML=cfhttp.FileContent>
            <cfset startContent=FindNoCase("<h1",pageHTML)>
            <cfset endContent=FindNoCase("<script type=",pageHTML,startContent)>
            <cfset pageContent=Mid(pageHTML,startContent,endContent-startContent)>

            <!--- Write out the content HTML using the same folder structure --->
            <cfset pageLocal=Replace(Replace(Replace(pageURL,'/whgdata/../','/'),':80',''),"http://","c:\")>
            <cffile action="write" file="#pageLocal#" output="#pageContent#">
          <cfelse>
            <cfdump var="#cfhttp#">
            <cfabort>
          </cfif>
        </cfif>

        <!--- Update the ptr used for looping --->
        <cfif endLink gt 0>
          <cfset ptrLink=endLink+1>
        <cfelse>
          <cfset ptrLink=len(treeHTML)+1>
        </cfif>
      <cfelse>
        <cfset ptrLink=len(treeHTML)+1>
      </cfif>
    </cfloop>

    <!--- Parse out just HTML in the Navigation Tree file throwing out the extra code --->
    <cfset startHTML=FindNoCase("<body",treeHTML)>
    <cfset endHTML=FindNoCase("</body>",treeHTML,startHTML)>
    <!--- Mid() takes a character count, not an end position; the +7 keeps the closing body tag --->
    <cfset fileHTML=mid(treeHTML,startHTML,endHTML-startHTML+7)>

    <!--- Write out the Navigation Tree file --->
    <cffile action="write" file="#fileLocal#" output="#fileHTML#">
    </p>
  <cfelse>
    <cfdump var="#cfhttp#">
    <cfabort>
  </cfif>
</cfloop>

<h1>Done!</h1>