by Dan Nicollet - May 15, 2012
Exorbyte Commerce will be attending the Internet Retailer Conference & Exhibition 2012 (IRCE 2012), June 5-8, 2012, at McCormick Place West, Chicago, IL.
We look forward to meeting many of the more than 9,000 ecommerce professionals expected to attend. We will be giving away gifts, including an iPad 3. Visit us at Booth #408 for your chance to win!

Receive a $100 discount on your IRCE 2012 pass if you register using the form below!
* Some restrictions apply – see our booth or call sales at +1 (888) 673-0148 for details
Tags: $100, 2012, code, discount, discount code, Exorbyte, exorbyte commerce, internet retailer, internet retailer conference & exhibition, irce, irce 2012, irce 2012 discount code, pass, pass discount
Posted in Business, Exorbyte, Merchandising, Search, Software, Usability, eCommerce | No Comments »
by Dan Nicollet - May 8, 2012
Exorbyte Commerce will be attending the Internet Retailer Conference & Exhibition 2012 (IRCE 2012), June 5-8, 2012, at McCormick Place West, Chicago, IL.
We look forward to meeting many of the more than 9,000 ecommerce professionals expected to attend. To celebrate this occasion, we will be giving away gifts, including an iPad 3 (watch for future announcements or just enter the drawing at the show). Receive a $100 discount on your IRCE 2012 pass if you register using the form below!
Tags: $100, 2012, code, discount, discount code, Exorbyte, exorbyte commerce, internet retailer, internet retailer conference & exhibition, irce, irce 2012, irce 2012 discount code, pass, pass discount
Posted in Business, Exorbyte, Merchandising, Search, Software, Technology, Usability, eCommerce | No Comments »
by Dan Nicollet - April 27, 2012
We are proud to announce the release of our new Magento extension. Magento has been a growing force in the ecommerce platform industry with many thousands of online stores relying on its rich online retail platform.
The new Exorbyte Commerce Magento extension is available for free in one click from Magento Connect. Try it today and see the power of Exorbyte Commerce speed and error tolerance. Your store sales and usability are guaranteed to benefit from it!
Tags: 1.5, 1.6, 1.6.1, 1.6.2.0, eCommerce, in-store search, magento, magento community, magento connect, magento extension, online retail, site search
Posted in Business, Exorbyte, Merchandising, Search, eCommerce | No Comments »
by Dan Nicollet - March 27, 2012
“Data Scientists Needed”. These three words have been on everyone’s lips for the past few weeks. Ever since the birth of the public Internet, business and political leaders have warned about the growing shortage of software and computer engineers needed by the expanding high-tech sector. The shortage of data analysts for the Big Data these new systems create has been overlooked. In 2009, Hal Varian (Google Chief Economist) predicted that “Statistician will be the next sexy job”. McKinsey also stresses the scale of this new type of data, which it sees as the next frontier for innovation, competition, and productivity.
Evidence now shows that the supply of these new professionals, dubbed Data Scientists, falls short of what the market demands.

What is a data scientist? EMC2 survey.
A less covered topic is what tools Data Scientists use for their fabled new craft. Today’s Data Scientists are quite different from traditional Statisticians in that they deal with a different scale and type of data. Traditional analysis tools are just not fast, scalable, and sophisticated enough for the job. For instance, many new web applications used by businesses to advertise and record transactions with customers, partners, and the public at large generate so much data that new data warehousing tools have to be used for storage and analysis (Hadoop, and columnar or in-memory databases). Traditional statistical analysis methods are not always sufficient to look at the data and make sense of it either. The data is not always structured, and there is so much of it that it takes too long to recognize the meaningful patterns without some Artificial Intelligence (AI) power. Machine learning and specialized algorithms for pattern recognition are there to help. Environments like Matlab and Octave (or R and Processing) allow Data Scientists to develop the analysis apps they need much faster by providing pre-programmed libraries of functions for data handling, normalization, data quality, enrichment, cleansing, analysis, etc.
Exorbyte has come to learn how MatchMaker can also offer a unique set of tools for Data Scientists. MatchMaker’s ability to store massive amounts of data and perform very fast, transparent, and configurable fuzzy matching or search on that data offers Data Scientists a new way to drill down into data. Visualization is great for seeing aggregates and overall patterns, but fuzzy search through the data gives a new sense of its depth and nature. While transforming, normalizing, and merging data from various sources, a fast error-tolerant matching engine is always necessary. The ability to run batch tasks in minutes instead of hours or days allows a more agile trial-and-error approach. MatchMaker features a near real-time in-memory database index which can be accessed through a number of extensions for search, data quality (cleansing, merging, deduplication), and data matching. It features advanced APIs (Java, Python, C, .Net, PHP, TCL) which allow for many specialized parameters. As an example, MatchMaker can perform lookups in 50 million name, address, and telephone records using full Levenshtein edit distance on all fields and all rows in under 10 ms, on a simple laptop with 2 GB of RAM and a quad-core CPU. That’s the fastest we have come across in years of competing within the enterprise software sector.
So if you are a Data Scientist and you want to give MatchMaker a try, contact us.
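To make the idea of an edit-distance lookup concrete, here is a minimal Python sketch of Levenshtein matching over a small list of names. It is not MatchMaker code and says nothing about how MatchMaker reaches its speed; the function names `levenshtein` and `fuzzy_lookup`, the `max_distance` parameter, and the sample records are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def fuzzy_lookup(query, records, max_distance=2):
    """Return (distance, record) pairs within max_distance, closest first."""
    q = query.lower()
    hits = []
    for record in records:
        distance = levenshtein(q, record.lower())
        if distance <= max_distance:
            hits.append((distance, record))
    return sorted(hits)


# Made-up sample data, for illustration only.
records = ["John Smith", "Jane Smythe", "Joan Smit"]
matches = fuzzy_lookup("Jon Smith", records)
for distance, record in matches:
    print(distance, record)
```

A brute-force scan like this compares the query with every record, which is workable for a few thousand rows but not for tens of millions; closing that gap is the job of a dedicated in-memory matching engine.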
For more Data Scientist tools, consult this excellent post: http://www.thedatascientist.com/2011/07/22/essential-tools-for-a-data-scientist/
The EMC2 research is here: http://www.emc.com/collateral/about/news/emc-data-science-study-wp.pdf
Tags: AmCharts, big data, BLAS, D3.js, data science, data scientist, data scientists, data scientists tools, Flare, Google Visualization API, haddop, hadoop, Hbase, Hive, LAPACK, Mahout, matlab, mysql, NumPy, processing, Protovis, r, Raphael.js, saas, SciPy, statistician, statisticians, statistics, stats, tools for data scientists, Weka
Posted in Business, Exorbyte, Search, Software, Technology, data matching, databases | No Comments »
by Dan Nicollet - March 6, 2012
At Exorbyte, we recently convened for a week-long strategy meeting. As we were poring over the past couple of years of sales data and our product road map for the MatchMaker platform, we uncovered an interesting fact. A number of our existing enterprise customers belong to a category of enterprise software solutions we had not yet identified as a target market: Business Process Automation (BPA) and Business Process Management (BPM).
Business Process Management is the discipline to which Business Process Automation belongs. BPA is a tactic organizations use to automate processes in order to contain costs (usually the ultimate goal of BPM). It involves integrating applications, restructuring labor resources and using software applications to automate routine tasks. Exorbyte has done deals with government agencies, insurance companies, online media and others that always involved the same scenario:
- an organization has already automated one or several processes
- it is forced to process a significant number of operations manually or forego some of its revenue because automation is deemed technically unfeasible (see example: BKK: “Up to 70,000 documents per day”).
At this stage it is important to note that manual processing is usually required because software cannot automate the matching between master data contained in company databases and non-standardized inputs (examples: manual order entries by field sales, search queries from web users, EDI documents with misspelled values, OCR insurance claims with missing characters, etc.)
- The organization or its BPA provider researches ways of enhancing the automated match rate with “fuzzy” (error-tolerant) logic. This involves matching misspellings, near-identical numbers (IDs), and even approximately matching sentences.
At this stage, organizations discover the limits of simple solutions to error-tolerant matching:
a. How to process large amounts of complex queries within large databases?
b. How to control and guarantee the accuracy of matches (what’s an acceptable match, and what is better queued for manual processing)?
c. How to match a sentence, a paragraph, a multi-field record, or a whole document with database records?
d. How to match special data types (words, numbers, IDs, etc.)?
- Exorbyte MatchMaker offers all the control and service-oriented ease that such challenges require.
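The accuracy question in the list above (what is an acceptable match, and what should be queued for manual processing) can be sketched in a few lines of Python. This is only an illustration using the standard library’s `difflib.SequenceMatcher`, not MatchMaker’s scoring; the function `route`, the two thresholds, and the sample master data are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def route(input_value, master_records, accept_at=0.92, review_at=0.75):
    """Find the best master record and decide how to handle the input.

    Returns (decision, best_record, best_score), where decision is
    "auto", "manual_review", or "no_match". Thresholds are illustrative.
    """
    best_record, best_score = None, 0.0
    for record in master_records:
        score = similarity(input_value, record)
        if score > best_score:
            best_record, best_score = record, score
    if best_score >= accept_at:
        decision = "auto"            # safe to process without a human
    elif best_score >= review_at:
        decision = "manual_review"   # plausible, but queue for a person
    else:
        decision = "no_match"
    return decision, best_record, best_score


# Hypothetical master data and incoming values.
masters = ["Acme GmbH", "Apex AG"]
for value in ["ACME GmbH", "Acme Gmbx", "zzzz"]:
    print(value, route(value, masters))
```

Raising `accept_at` lowers the risk of false positives at the cost of a longer manual queue, which is the trade-off a BPA team has to tune for each process.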

MatchMaker can eliminate much of the manual processing costs of BPA applications
It turns out that Exorbyte has such solutions already running at Yahoo!, the German Finance Ministry, Allianz (Fortune 15 and largest insurance company in the EU), and more. What’s even more interesting is that these customers usually view MatchMaker as a “must-have” mission critical component with an unusually impressive ROI. So we decided to really enhance MatchMaker’s solutions services for this segment by publishing a new case study and a data sheet about MatchMaker BPA solutions. There will be more on this topic in the months to come.
Tags: automate, automation, BPA, BPCC, BPM, Business Process Automation, business process competency center, Business Process Management, data, fuzzy, manual processing, matchmaker, optimize
Posted in Banking, Business, Exorbyte, Government, Insurance, Software, databases | No Comments »
by Dan Nicollet - March 5, 2012
Exorbyte’s SaaS software, Exorbyte Commerce, has almost nothing to do with desktop software. That said, we have become firm believers in the SaaS model and have observed its potential and limitations.
Most of the large cloud software revenues have so far come from Web applications that replaced existing desktop systems (CRM, MS Office-like products such as Google Apps, etc.). Microsoft’s own model for ASP versions of its Office product line was complex and gave few new benefits to users and administrators.

Numecent just came out of stealth mode and explained what it has been quietly working on: cloudpaging. It is a new way of deploying desktop apps over the cloud that bridges the gap between fast cloud-native apps (like Salesforce.com and Google Apps) and traditional desktop software, for which only remote desktop and pixel-streaming technology was available if you wanted to access it remotely. This seems like a fairly revolutionary technology. Application Jukebox is supposed to allow users to check out an application (Photoshop, Autocad, Word, Excel, etc.), or a whole OS and its full install of apps (Windows, Linux, etc.), and then use it with a single license. The next user can do the same after it has been checked back in. This is supposedly doable with something as simple as a smartphone or a tablet PC.
The trick: Cloudpaging doesn’t use “pixel-streaming” technology like desktop-sharing and virtualization apps already out there. Instead, it temporarily downloads bits of the application (instructions) and runs them on the client device.
Read more: http://www.businessinsider.com/were-blown-away-this-startup-could-literally-change-the-entire-software-industry-2012-3#ixzz1oN1aGNx0
Find out more: http://numecent.com
Tags: application jukebox, cloudpaging, desktop apps, numecent, remote desktop, virtualization
Posted in Software, Technology, Usability | No Comments »
by Dan Nicollet - February 15, 2012
MatchMaker, Exorbyte’s high-speed error-tolerant database search and data matching platform, is used in a number of fields involving large-scale data matching:
- Process automation (BPM, BPA),
- Database search (eCommerce, Enterprise),
- Data quality (merging, enhancing, cleansing, or deduplicating data).

The Miracle Data Matching Black Box
In all cases, data matching is at the core of what’s being done. It’s where the most complex of the programmed algorithms come into play (string matching, edit distance, Levenshtein, Soundex, specialized scripts, geo-proximity, phonetics, low-level semantics, multiple languages including Chinese and Arabic, etc.). However, this is not a one-size-fits-all world. There are as many matching setups as there are use cases for MatchMaker. If you told an experienced data matching programmer, a DB admin, or a user of MatchMaker that one can build a universal data matching engine, they would probably burst out laughing and maybe even call you delusional. Sorry, but that’s just true.

MatchMaker Principles
Each data matching situation is unique. However, I am always amazed at the number of users of competing products from major enterprise platform vendors (wish I could name them here, but fire away in the comments below!) who complain about the same thing: “It was like a black box!” They report having been sold on the sophistication of a single matching engine stuffed with fancy proprietary algorithms that were meant to match health insurance claims, OCRed documents, or ecommerce search queries as easily as names in a phone book or hotel addresses in a travel booking engine.
Well, that’s simply not right. It does make the job of the less sophisticated of enterprise software sales people easier sometimes to call a product “miracle” and to sell it to customers on that basis. But no matter how personable the peddler, disappointment always sets in. The only option left after that is build-it-yourself which may or may not work and be cost efficient.
MatchMaker is not a black box. It can be configured in an almost infinite number of ways, but it is also very transparent about the output of matching operations: it provides ranking and scoring information that shows how and why a given result was chosen by the engine.
Exorbyte MatchMaker can be tested and demonstrated. No black box at Exorbyte. We like our data matching transparent and highly configurable. We cannot solve all data matching challenges, but we can try. When we are given a chance to do so, we seem to almost always come out with higher quality and performance than comparable systems (we have a few stories to tell you if you care to hear). That’s maybe because every version of MatchMaker since our first 1999 version was guided by the same four principles on the image to the right: efficiency, transparency, flexibility, universality.
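As a rough picture of what transparent scoring can mean in practice, the Python sketch below returns a per-field breakdown together with the total score, so a user can see why one candidate outranked another. It is not MatchMaker’s scoring model; the function `explain_match`, the field names, and the weights are hypothetical, and the similarity measure is the standard library’s `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; a real deployment would tune these per use case.
WEIGHTS = {"name": 0.5, "street": 0.3, "phone": 0.2}


def explain_match(query, candidate, weights=WEIGHTS):
    """Score a candidate record and report each field's contribution."""
    breakdown = {}
    for field, weight in weights.items():
        q = query.get(field, "").lower()
        c = candidate.get(field, "").lower()
        # A missing value on either side contributes nothing to the score.
        sim = SequenceMatcher(None, q, c).ratio() if q and c else 0.0
        breakdown[field] = {
            "similarity": round(sim, 3),
            "weight": weight,
            "contribution": round(sim * weight, 3),
        }
    total = round(sum(part["contribution"] for part in breakdown.values()), 3)
    return total, breakdown


# Made-up records, for illustration only.
query = {"name": "Jon Smith", "street": "12 Main St", "phone": "555-0100"}
candidate = {"name": "John Smith", "street": "12 Main Street", "phone": "555-0100"}
total, breakdown = explain_match(query, candidate)
print(total)
for field, part in breakdown.items():
    print(field, part)
```

Exposing the breakdown rather than a single number is what lets a user question a result and adjust a weight instead of guessing at what the engine did.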
Tell us your miracle black box stories below.
Tags: bi-gram, BPA, BPM, data matching, data matching engine, databases, duplicates, edit distance, false positives, informatica, levenstein, netrics, search, search software america
Posted in Banking, Exorbyte, Government, Industries, Insurance, Online Travel, Search, Software, Technology, data matching, eCommerce | No Comments »
by Dan Nicollet - February 1, 2012
It’s worth thinking about where you come from every once in a while. At Exorbyte, this means remembering that all of our products and services are built on MatchMaker, our unique high-speed fuzzy data matching platform.
Recently, MatchMaker was compared by two Fortune 50 clients with two of the very top data matching and data search platforms (one was acquired by Microsoft, the other by Oracle). It was expressly chosen over these competing alternatives because it was faster and more error-tolerant (fuzzy).

MatchMaker Guiding Principles - Best Data Matching Server
So here is a quick reminder of why MatchMaker is unique and better:
MatchMaker is Exorbyte’s error-tolerant data search and data matching server. MatchMaker is a transparent and efficient data matching engine which offers the following key features:
- A fast in-memory database caching architecture enabling the fastest and most fuzzy data matching service possible.
- A flexible and highly configurable GUI environment for Windows and Unix platforms.
- A rich set of universal, language-independent string matching and data matching tools (Levenshtein edit distance, Soundex, alias, geo-proximity, and more).
- A rich set of pre-configured methods for specific use cases (string matching, words method, search, etc.)
- Support for parallel multi-server environments, splitting large data sets, distributing query load across multiple servers or MatchMaker instances.
- Benchmarking, testing tools and GUIs for tuning the quality and speed of the matching output.
- Transparent and configurable scoring and ranking of all matching and search results.
This doesn’t make MatchMaker a finished product. It is definitely a component, a building block for solutions in many different enterprise software applications (search, data quality, data management, BPA, BPM, etc.). However, in many of these cases MatchMaker replaces a slower engine and changes the cost/benefit outcome so much that, even as a simple component, it finds its way into the largest IT organizations again and again.
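Of the string matching tools named in the feature list above, Soundex is the easiest to show in a few lines. The Python sketch below implements the standard American Soundex code, which maps names that sound alike to the same four-character key. It is a textbook version written for illustration, not MatchMaker’s implementation, and the function name `soundex` is an assumption.

```python
# Consonant groups of American Soundex; vowels, y, h, and w carry no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the four-character American Soundex key, e.g. 'R163'."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            key += code
        previous = code  # a vowel resets this, so repeated codes count again
        if len(key) == 4:
            break
    return key.ljust(4, "0")


for name in ["Robert", "Rupert", "Ashcraft", "Tymczak"]:
    print(name, soundex(name))
```

A phonetic key like this is typically used as a cheap first filter: only records sharing the query’s key are then compared with a more expensive measure such as edit distance.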
Tags: alternatives, data matching, database, edit distance, endeca, fast, fuzzy, in-memory, informatica, levenshtein, mdex, netrics, soundex, tibco
Posted in Exorbyte, Search, Technology, data matching, databases | No Comments »
by Dan Nicollet - January 30, 2012
Exorbyte Commerce continues to grow, with bigger customers joining the community every month. We have collected the most common questions and answers about Exorbyte Commerce. These are based on questions we get from potential customers and existing users of our online store search platform. As you know, using Exorbyte Commerce is not difficult, but we do offer many customization, reporting, and tuning features which you may want to learn about. In our support area you will find many answers regarding search, merchandizing, billing, integration, and technical features.
Tags: exorbyte commerce, FAQ
Posted in Exorbyte, Search, Software, eCommerce | No Comments »
by Dan Nicollet - November 28, 2011
The Open Source movement is moving into government data. Governments are finding a new source of untapped economic stimulus in the mountains of data they collect. The data is collected for the ultimate good of the public but was rarely shared, because information access was too labor-intensive and expensive until recently. Things have changed.
![Open eGovernment Data](http://blog.exorbyte.com/wp-content/uploads/2011/11/GOV_opendata1-300x168.png)
ETALAB (France), data.gov.uk (UK), data.dc.gov (Washington, DC, US), whitehouse.gov/open (US), and countless other local and national governments have opened their data coffers. In the case of DC, for instance, the cost of publishing the data was $50K for the city. The DC government expected it to spur the creation of a few new ventures and a bit of private investment. Instead, 50 startups were born and $3M was invested. There is a world of open data coming to the private software industry.
Open Government data is also going to be Big Data. The volume of data collected is by definition larger than traditional “enterprise data”, for instance (especially at the national level). The tools being developed for big data will solve some of the issues with access and real-time analytics that exist with government data. Exorbyte MatchMaker is one of these tools. That’s why government agencies have already chosen MatchMaker for their search and data access challenges (2 national European census agencies, the German Finance Ministry, and more).
Are you ready for open government data? Any ideas what would make sense to build with this data?
Tags: big data, data, economic stimulus, egovernment, etalab, governement, open data, open source, software
Posted in Business, Government, Software, databases | No Comments »