We've Got You Covered.

Friday, July 31, 2015

Screen scraper changes between GSA 7.0 and 7.2+

Background

A few years back, the Google team at Fig Leaf Software built a custom application in .NET to manage Google Search Appliance functionality using a rules engine of sorts, rather than requiring manual interaction with the GSA. I didn't build it; we have a very skilled .NET development team that did almost all of the heavy lifting, and all I had to do was build a very simple prototype.

To interact programmatically with the GSA, Google provides an administrative API that can be accessed from any language, along with client libraries for .NET and Java. Unfortunately, not all admin console functionality is exposed via the admin API. You probably know what that means - screen scraping is needed if you want to access that functionality. Writing screen scrapers is no fun, because the data format is likely to change pretty frequently, and that's exactly what happened when this customer upgraded from GSA 7.0 to 7.2. All the admin API functionality worked, but the ability to upload and delete synonym files no longer worked, because that relied on screen scraping.

Of course, the request/response format for screen scraping is undocumented, so you typically have to figure out what the server is looking for using a recording proxy or packet sniffer. I really like Fiddler for this sort of thing. It's a lot less complicated than something like Wireshark, and really shows you everything you need to see in HTTP. That's basically the approach we followed to build the initial application, and it's how we upgraded it to support GSA 7.2/7.4. If you know how to build screen scrapers, there's nothing you can't figure out on your own, but I thought this might save a few valuable hours for someone out there.

Login process

The first problem the customer reported was that the login was failing on 7.2. Foolishly, I thought that would be the only change - what was I thinking? Nevertheless, that obviously had to be resolved first, so I took a look at the login process against a 7.0 vs a 7.2 GSA.

On 7.0, the login process is pretty simple. The client sends a GET request, and the first response from the GSA sets a session cookie. Then, the client sends a POST request with a MIME type of application/x-www-form-urlencoded that contains three parameters, like this (assuming a username "admin" and password "figleaf"):

POST /EnterpriseController HTTP/1.1
Content-Type: application/x-www-form-urlencoded
Host: gsa.figleaf.local:8000
Cookie: S=enterprise=P10OlkGyca0
Content-Length: 60
Expect: 100-continue

actionType=authenticateUser&username=admin&password=figleaf

On 7.2, things are a bit more complicated. The HTML form on the GSA itself creates two parameters, actionType and reqObj. The second parameter is an array, and prior to being URL-encoded contains a value like this:

reqObj=[null,"admin","figleaf",null,1]

I have no idea what the other parameters represent, but they don't seem to change, so I don't care! The string containing the parameters must be URL-encoded, so you end up with something like this for the entire POST request:

POST /EnterpriseController?a=1 HTTP/1.1
Content-Type: application/x-www-form-urlencoded
Host: gsa.figleaf.local:8000
Cookie: S=enterprise=P10OlkGyca0
Content-Length: 87
Expect: 100-continue

actionType=authenticateUser&reqObj=%5Bnull%2C%22admin%22%2C%22figleaf%22%2Cnull%2C1%5D

There's another difference, too - the action URL has an extra parameter, a=1. I don't really know what that represents, but you won't be able to log in successfully without it.
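As a rough illustration, the two login formats described above can be assembled with Python's standard library. This is only a sketch, not the code from the original .NET application: the function name build_login_request, the gsa_72_or_later flag, and the base URL argument are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_login_request(base_url, username, password, gsa_72_or_later):
    """Build (but do not send) the admin console login POST.

    Hypothetical helper: the field names and the a=1 query parameter come
    from the captured requests; everything else is an assumption.
    """
    url = base_url.rstrip("/") + "/EnterpriseController"
    if gsa_72_or_later:
        # 7.2+: the credentials travel inside an array named reqObj.
        # The null, null and 1 slots are copied from the capture; their
        # meaning is unknown.
        req_obj = json.dumps([None, username, password, None, 1],
                             separators=(",", ":"))
        fields = {"actionType": "authenticateUser", "reqObj": req_obj}
        url += "?a=1"
    else:
        # 7.0: three plain form parameters.
        fields = {
            "actionType": "authenticateUser",
            "username": username,
            "password": password,
        }
    # urlencode percent-encodes the brackets, commas and quotes in reqObj.
    body = urllib.parse.urlencode(fields).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


request = build_login_request(
    "http://gsa.figleaf.local:8000", "admin", "figleaf", gsa_72_or_later=True
)
print(request.full_url)
print(request.data.decode("ascii"))
```

To actually log in, the request would need to go through an opener that keeps cookies, for example urllib.request.build_opener(urllib.request.HTTPCookieProcessor()), after an initial GET so that the session cookie is set first.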

One last issue, which will occur with many screen scraping operations, is that you typically want to search the response for specific bits of text to extract values, or learn whether the operation succeeded. Many of these had changed between 7.0 and 7.2. Specifically, for 7.2, we can look for the text "login_err" which is actually quite nice!
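A check along these lines is about as simple as screen scraping gets. The sketch below assumes that "login_err" appears in the response only when the login is rejected, which is my reading of the behavior rather than anything documented; the function name login_failed is hypothetical.

```python
def login_failed(response_html):
    """Return True if the login response contains the "login_err" marker."""
    # Assumption: the marker shows up only when the GSA rejects the login.
    return "login_err" in response_html


print(login_failed('<div id="login_err">Invalid credentials</div>'))
print(login_failed("<html><body>Admin Console</body></html>"))
```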

Query Settings page

Once that was working, we quickly discovered that the synonym management functionality didn't work. Since that was the entire purpose of this screen scraper, fixing the login wasn't enough.

To upload a new synonym file, you need a POST request with a MIME type of multipart/form-data. Here's what that looks like for GSA 7.0:

POST /EnterpriseController HTTP/1.1
Content-Type: multipart/form-data; boundary=----------8d2993bd51d1414
Host: gsa.figleaf.local:8000
Cookie: S=enterprise=KbhJUSnSvA4
Content-Length: 853
Expect: 100-continue

------------8d2993bd51d1414
Content-Disposition: form-data; name="type";

0
------------8d2993bd51d1414
Content-Disposition: form-data; name="syn_lang_select";

en
------------8d2993bd51d1414
Content-Disposition: form-data; name="sw_lang_select";

all
------------8d2993bd51d1414
Content-Disposition: form-data; name="itemName";

tst_synonyms_en
------------8d2993bd51d1414
Content-Disposition: form-data; name="actionType";

updateQueryExp
------------8d2993bd51d1414
Content-Disposition: form-data; name="security_token";

AJhxEn3QZlHwjh8Idgd_q51Fh8c:1438315574127
------------8d2993bd51d1414
Content-Disposition: form-data; name="upload";

Upload
------------8d2993bd51d1414
Content-Disposition: form-data; name="fileName"; filename="tst_en.txt"
Content-Type: text/plain

{test1,test2}
------------8d2993bd51d1414--

With GSA 7.2, the "type" field is now "qeType", and you need to add "a=1" again, this time as a form field:

POST /EnterpriseController HTTP/1.1
Content-Type: multipart/form-data; boundary=----------8d2993d49e9990f
Host: gsa.figleaf.local:8000
Cookie: S=enterprise=ccjbLm2KD2Y
Content-Length: 932
Expect: 100-continue

------------8d2993d49e9990f
Content-Disposition: form-data; name="qeType";

0
------------8d2993d49e9990f
Content-Disposition: form-data; name="a";

1
------------8d2993d49e9990f
Content-Disposition: form-data; name="syn_lang_select";

fr
------------8d2993d49e9990f
Content-Disposition: form-data; name="sw_lang_select";

all
------------8d2993d49e9990f
Content-Disposition: form-data; name="itemName";

tst_synonyms_en
------------8d2993d49e9990f
Content-Disposition: form-data; name="actionType";

updateQueryExp
------------8d2993d49e9990f
Content-Disposition: form-data; name="security_token";

bYkoFblS1Nm70sAOrPeGcdFgF04:1438316205640
------------8d2993d49e9990f
Content-Disposition: form-data; name="upload";

Upload
------------8d2993d49e9990f
Content-Disposition: form-data; name="fileName"; filename="tst_en.txt"
Content-Type: text/plain

{test1,test2}
------------8d2993d49e9990f--
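For anyone rebuilding this outside .NET, here is a minimal sketch of how the two upload bodies above could be generated with Python's standard library. The helper names synonym_upload_fields and encode_multipart and the gsa_72_or_later flag are hypothetical. The field names, their order, and the boundary style are copied from the captures, except that the trailing semicolon after each name="..." is left out; I have not verified whether the GSA cares about that.

```python
import uuid


def synonym_upload_fields(security_token, item_name, language, gsa_72_or_later):
    """Return the ordered form fields for a synonym upload (hypothetical helper)."""
    if gsa_72_or_later:
        # 7.2+: "type" was renamed to "qeType", and a=1 is sent as a field.
        fields = [("qeType", "0"), ("a", "1")]
    else:
        fields = [("type", "0")]
    fields += [
        ("syn_lang_select", language),
        ("sw_lang_select", "all"),
        ("itemName", item_name),
        ("actionType", "updateQueryExp"),
        ("security_token", security_token),
        ("upload", "Upload"),
    ]
    return fields


def encode_multipart(fields, file_name, file_text, boundary=None):
    """Return (content_type, body_bytes) for a multipart/form-data POST."""
    if boundary is None:
        # Random boundary in the same style as the captured requests.
        boundary = "----------" + uuid.uuid4().hex[:15]
    delimiter = "--" + boundary
    lines = []
    for name, value in fields:
        lines += [delimiter,
                  'Content-Disposition: form-data; name="%s"' % name,
                  "",
                  value]
    # The synonym file itself goes last, in a part named "fileName".
    lines += [
        delimiter,
        'Content-Disposition: form-data; name="fileName"; filename="%s"' % file_name,
        "Content-Type: text/plain",
        "",
        file_text,
        delimiter + "--",
        "",
    ]
    body = "\r\n".join(lines).encode("utf-8")
    return "multipart/form-data; boundary=" + boundary, body


fields = synonym_upload_fields(
    "bYkoFblS1Nm70sAOrPeGcdFgF04:1438316205640", "tst_synonyms_en", "fr", True
)
content_type, body = encode_multipart(fields, "tst_en.txt", "{test1,test2}")
print(content_type)
print(len(body))
```

The security token in the usage lines is just the value from the capture; a real one has to be pulled from the page that was loaded immediately beforehand.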

Odds and ends

I didn't run into this myself, but you may run into AJAX calls for some changes. The GSA admin console has changed quite a bit in GSA 7.2, and uses AJAX for some functionality. If so, you'll need to extract the security token from the previous AJAX response rather than from the form. For each data submission, a one-time-use security token is injected into each form or AJAX response, and you have to send it back with the subsequent request. In the case of synonyms, the security token is still in a form, but I did notice it in some of the AJAX responses I got while doing other things.
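For the form case, pulling the token out of the HTML might look something like the sketch below. It assumes the token sits in an input element named security_token with a value attribute; the field name comes from the POST bodies above, but the surrounding markup is a guess, and the class and function names are hypothetical. Tokens embedded in AJAX responses would need different handling, which isn't shown here.

```python
from html.parser import HTMLParser


class SecurityTokenParser(HTMLParser):
    """Collect the value of the first <input name="security_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.token is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == "security_token":
            self.token = attributes.get("value")


def extract_security_token(page_html):
    """Return the token string, or None if the page doesn't contain one."""
    parser = SecurityTokenParser()
    parser.feed(page_html)
    return parser.token


# Assumed markup, for illustration only.
sample = ('<form><input type="hidden" name="security_token" '
          'value="AJhxEn3QZlHwjh8Idgd_q51Fh8c:1438315574127"></form>')
print(extract_security_token(sample))
```

Since each token is good for one submission only, the extraction has to be repeated on the most recent response before every POST.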

Conclusion

Ideally, we should never have to write screen scrapers. If you need something that isn't exposed by the admin API, open a support ticket and submit a feature request - maybe it'll be in the next admin API version! But if you have an existing screen scraper for GSA 7.0 and can't wait for a Google API upgrade, you may find this useful when upgrading to GSA 7.2 or higher.

[Note: cross-posted on Dave Watts' personal blog]

Wednesday, June 17, 2015

New Sass Class Added: Fast Track to Syntactically Awesome Stylesheets

Washington, DC and Online - Fig Leaf Software, a full service digital agency and technical training firm, today added a new Fast Track to Syntactically Awesome Stylesheets class.  


Sass is the most mature, stable, and powerful professional grade CSS extension language in the world. Sass lets you use features that don't exist in CSS yet like variables, nesting, mixins, inheritance and other nifty goodies that make writing CSS fun again. Once you start tinkering with Sass, it will take your preprocessed Sass file and save it out as a normal CSS file that you can use in your web site. This one-day course covers Sass and Compass (a CSS authoring framework for Sass) development essentials. Whether you're developing a static site, a dynamic site with a CMS (Drupal), or building an advanced web-based application (Sencha), you'll find that Sass is an indispensable tool for taming and optimizing your CSS.


Course Prerequisites


To gain the most from this class, you should already have:
  • A basic understanding of CSS syntax
  • Familiarity with basic programming concepts such as defining variables and using functions

Course Objectives


During this one-day, hands-on, instructor-led course you will refactor a very long and difficult-to-maintain CSS file into a set of .scss (Sassy CSS) files. You'll learn how to restructure your CSS for optimal maintainability and produce minified production builds. You'll also learn how to use advanced CSS tricks such as base-64 encoding of images to significantly increase the performance of your web sites and web applications.

Course Outline


UNIT 1: INTRODUCING THE COURSE

UNIT 2: INTRODUCING SASS
  • Introducing Compass and Sass
  • Debugging CSS
  • Working with Variables
  • Using Partials to Organize your Stylesheet
UNIT 3: WORKING WITH SASS
  • Nesting Rules
  • Creating Responsive Themes with @media Directives
  • Defining Inheritance with @extend
  • Defining and Invoking Mixins
  • Using Control Directives
  • Defining Sass Functions
UNIT 4: USING COMPASS FOR COMPATIBILITY AND PERFORMANCE
  • Configuring Browser Support
  • Using Compass to Support CSS3 Features
  • Using Compass for Typography
  • Embedding Images in your Stylesheet
  • Using Sprites to Improve Performance

Fig Leaf Software has trained more than 35,000 web designers, developers and marketers in the nuances of web and mobile design and development.  Fig Leaf is a Service-Disabled, Veteran-Owned Small Business (SDVOSB) providing consulting, training and software solutions on GSA Schedule to government agencies, universities, companies, and nonprofits in DC, MD, VA and across North America. 

Thursday, June 4, 2015

New Fast Track to Adobe ColdFusion 11 Training Launched!

Washington DC and Online -- Fig Leaf Software recently launched a new Fast Track to ColdFusion 11 training class.  The new FTCF 11 class is a 3-day course that provides experienced Web developers with the knowledge and hands-on practice they need to start building and maintaining dynamic and interactive Web applications using the ColdFusion application server. 

This course was authored by Fig Leaf Software, based on decades of practical ColdFusion consulting experience. It is available as a private or public class. For private classes, please contact Steve Drucker at 415-8483 or email training@figleaf.com.

Register Today

ColdFusion 11 Course Prerequisites

To gain the most from the class, you should already have:
  • A familiarity with Web terminology
  • An understanding of Web server characteristics
  • Experience with the HTML tag set and syntax
  • Familiarity with the SQL command set, including SELECT, INSERT, UPDATE, and DELETE

Course Objectives

  • Create a connection to a database using the ColdFusion Administrator
  • Use ColdFusion Builder to efficiently develop and troubleshoot code
  • Capture information in HTML forms
  • Read and write information to/from a database.
  • Represent complex structures using abstract data types: lists, arrays, and structures
  • Separate your application into a three-tiered architecture of User Interface components, Business Logic, and SQL for easier maintainability and flexibility
  • Dynamically send electronic mail
  • Secure your application using a password-based framework
  • Implement a RESTful API to support modern JavaScript-based web apps
  • Implement a simple ColdFusion application using best practices.

Course Outline


Unit 1: Introducing the Course

  • Meeting the Prerequisites
  • Understanding the Course Format
  • Reviewing the Course Outline

Unit 2: Introducing ColdFusion

  • Reviewing ColdFusion's Features and Capabilities
  • Introducing the ColdFusion Administrator
  • Working with ColdFusion Builder
  • Debugging and Troubleshooting your Apps

Unit 3: Getting Started with CFML

  • Working with Variables
  • Commenting Code
  • Using Functions
  • Creating Functions
  • Using Control Logic
  • Including Common Code

Unit 4: Using the Application Framework

  • Using Application.cfc to Define an Application
  • Implementing ColdFusion Components
  • Implementing Roles-Based Security

Unit 5: Querying Databases

  • Using <cfquery> to retrieve data
  • Implementing Search Forms
  • Inserting New Records
  • Supporting File Uploads
  • Updating Existing Records
  • Deleting Data
  • Invoking Stored Procedures

Unit 6: Dynamically Generating Office Documents

  • Generating PDFs
  • Generating Excel Files
  • Sending Email

Unit 7: Designing and Implementing RESTful APIs

  • Exposing CFC Methods for Remote Access
  • Implementing a RESTful API

Wednesday, June 3, 2015

The Citizen Centric Journey: Using the Inbound Methodology for Government

From the “Buyer’s Journey” to the “Citizen Centric Journey”: Transforming citizen service in government with inbound marketing techniques

By Bret Peters, CMO, Fig Leaf Software
LinkedIn @petersbret


As government agencies strive to find new and better ways to serve citizens, those that have discovered the power of inbound are reaping the rewards.  Government agencies are re-inventing the traditional buyer’s journey to create a citizen centric journey which provides improved citizen engagement experiences.

For corporations, the buyer’s journey reflects the stages that a prospective buyer would travel as they seek to solve a need and buy a good or service from a company.  Since most government agencies are not selling goods or services, the buyer’s journey is better represented as a citizen centric journey, one which serves to outline the path that a citizen takes to get information or services from a government agency.

Utilizing an inbound methodology, government agencies are shifting from a one-size-fits-all approach to a more personalized approach to citizen centric engagement, which focuses on people, and takes advantage of technology to support the most viable approaches to improving customer service.

In government, the inbound movement is a result of government employees striving to take advantage of technology and industry best practices to effect positive change and improve customer service.  Utilizing an inbound methodology to serve citizens is an efficient and cost-effective approach in a some-to-many service environment -- where citizens have come to expect government to adapt to new technologies and keep pace with innovation.

[Figure: Citizen Centric Journey, by Fig Leaf Software and HubSpot]

The stages of an inbound methodology for government are outlined below.  As citizens, we typically don’t engage our government until we need something -- We are “Citizens”.  At any point that we begin to experience a need or desire for a service offered by a government agency we typically begin to search for information or articles -- We move to the “Citizen Visitor” stage.

Once we find the agency that can help us, we engage that agency through some form of communication.  Using an inbound methodology we would likely fill out a form or download some content -- We become a “Citizen Contact”. In the case of a form requesting information, the form data would be routed internally within the agency to direct us either to the content we desire or to a department or individual who can assist us.  The form data may be routed through a simple or complex workflow depending on the nature of the request, and it may be routed using email or a Customer Relationship Management (CRM) software system such as Salesforce.com.  Once properly routed, we receive the information or services we require -- We become the “Serviced Citizen”.  With the proper response, follow up and nurturing, we become a “Satisfied Citizen” who will communicate our experiences to friends, family, co-workers and others.  

Government agencies that utilize an inbound methodology spend time thinking about how to delight their citizens.  They build content to attract citizens who need their services.  They create user-friendly web and mobile experiences to invite visitors to engage with them, and they manage their contacts with smart content and great experiences throughout simple and complex workflows to ensure they are providing excellent service.  Perhaps most important, however, government agencies that subscribe to an inbound methodology are monitoring their effectiveness and constantly seeking ways to improve and provide a more citizen centric journey.



Friday, May 29, 2015

How to make your webinars more engaging and interactive

Join Adobe, Fig Leaf Software and the Content Marketing Institute as we host Ken Molay of Webinar Success for a one-of-a-kind Adobe Connect event.

See more free Adobe Connect best practices events at www.figleaf.com/events. 


Join Ken Molay, president of Webinar Success, as you learn techniques for building interactivity in your lead generation web seminars. You will explore ways to more fully involve your audience in an online presentation. You have the power to engage your audience in active participation that improves the persuasiveness of your message and their retention of your material.
This seminar is appropriate for anyone who presents via web seminars. We will focus on marketing applications designed to promote your business, products, or services. A live question and answer session will let you focus on the issues of most importance and benefit to your organization.
You will learn:
1.  How to configure a presentation to incorporate interactivity
2.  Presentation techniques that encourage active participation
3.  Conferencing features that support interactivity and how best to use them
4.  When to use interactivity and when to avoid it
5.  Potential pitfalls of interactivity and how to protect yourself

Register Today
Date: Wednesday, June 3, 2015 - 2:00pm to 3:00pm

About Us

Fig Leaf Software is an award-winning team of imaginative designers, innovative developers, experienced instructors, and insightful strategists.

For over 20 years, we’ve helped a diverse range of clients...

Contact Us

202-797-7711

Fig Leaf Software

1400 16th Street NW
Suite 450
Washington, DC 20036

info@figleaf.com