Having made 42 online trademark databases available to the industry on the search management tool Inspiro and on its website, Avantiq has proved its experience in the trademark database field. Larissa Best explains.
Forty-two trademark databases mean 38 million trademark documents, with more than 15 million logos. Together they take up 716 gigabytes on Avantiq’s servers. To keep the data fresh and ready to use, we process 4,500 updates per year.
With the growing number of trademark applications worldwide, the task of updating the data has become increasingly time-consuming. In 2011 alone, we updated 13 million trademarks.
To be efficient, we need a strong and effective team. For all of this data, we have just six people who do all the work: three IT experts dedicated to database developments and improvements; two operators in charge of updating and maintaining the databases; and one officer who constantly checks and improves our data quality.
The usual development process for a database
1. Acquisition: It is important to find the right partner in order to get the highest standard of quality. The local patent and trademark office (PTO) would be the first point of contact but may not have any data in an electronic format or may not be interested in selling the data.
The second option is to contact local service providers and local law firms. After having found the possible partners, you need to exchange data in order to analyse the data quality as well as the data structure. If the samples are adequate and the quality is good, the contract negotiations begin and a provider is chosen. Timeframe: 1 to 6 months
2. Development: When the data is received, the IT expert will run the first quality checks on the complete set of data. He or she will do a complete analysis of the data and then start the conversion process into the internally used format. A second check is done after the conversion to make sure that none of the quality was lost. Timeframe: 1 to 6 months
3. Testing: The database now becomes available internally to the operations department, which runs cross-checks between the official PTO data (if available), the data gathered from local agents and the data available from other service providers (if available). Timeframe: 2 weeks to 1 month
4. Integration: When the database has passed all the checks and verifications, the data is made publicly available on our web platform Inspiro and the quick search facility on avantiq.com. Marketing then integrates the new dataset into the existing offers and starts its usual marketing cycle. Timeframe: 2 weeks
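The cross-checks in the testing step can be pictured as a comparison of trademark numbers between the new database and each reference source. The Python sketch below is only an illustration under assumptions: the function name `cross_check`, the source labels and the idea of comparing by trademark number are hypothetical and not a description of Avantiq's internal tooling.

```python
def cross_check(candidate_numbers, reference_sources):
    """Compare the new database with each reference source by trademark number.

    candidate_numbers: trademark numbers in the database under test.
    reference_sources: mapping of a source label (e.g. "pto") to its numbers.
    Both the labels and the number-based comparison are illustrative assumptions.
    """
    candidate = set(candidate_numbers)
    report = {}
    for source_name, numbers in reference_sources.items():
        reference = set(numbers)
        report[source_name] = {
            # Present in the reference source but absent from the new database.
            "missing_from_candidate": sorted(reference - candidate),
            # Present in the new database but unknown to the reference source.
            "only_in_candidate": sorted(candidate - reference),
        }
    return report
```

Any non-empty list in the report would be a case for the operations department to look at before the database moves on to integration.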
How to update the data
As every data provider has its own format, the update process is individually tailored to each jurisdiction. There are five possible ways to receive the update data.
There are also five different data formats: XML, CSV, flat file, web service object and HTML. Combining the five ways of receiving the data with the five formats gives 25 different ways to receive the data from a provider, each needing individual attention.
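One way to keep that many combinations manageable is to separate how a file arrives from how it is parsed, so that each format needs only one parser. The following Python sketch shows the idea for two of the five formats, using only the standard library. The element name `trademark`, the registry `PARSERS` and the function names are assumptions made for illustration, not the actual provider formats.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_xml(payload: str) -> list[dict]:
    """Turn each <trademark> element (hypothetical name) into a flat dict."""
    root = ET.fromstring(payload)
    return [
        {child.tag: (child.text or "").strip() for child in mark}
        for mark in root.iter("trademark")
    ]


def parse_csv(payload: str) -> list[dict]:
    """Read CSV text whose first row holds the column names."""
    return [dict(row) for row in csv.DictReader(io.StringIO(payload))]


# Registry of parsers keyed by format; further formats would be added here.
PARSERS = {"xml": parse_xml, "csv": parse_csv}


def parse_update(fmt: str, payload: str) -> list[dict]:
    """Dispatch to the parser registered for the given format."""
    try:
        parser = PARSERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"no parser registered for format {fmt!r}") from None
    return parser(payload)
```

With this split, the way the data is delivered only has to produce a payload and a format label, and the rest of the update does not need to know where the payload came from.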
Like the development of a new database, the update is a phased process, in this case with five phases.
1. Reception of the source data;
2. Validation of the format;
3. Conversion into a structured document and validation of content;
4. Retrieval of the current version of the document and updating the document; and
5. Validation of the new data.
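The five phases can be read as a small pipeline in which each phase hands its result to the next. The Python sketch below is a minimal illustration under stated assumptions: records are plain dictionaries, the store is an in-memory dictionary keyed by trademark number, and the required fields in `REQUIRED_FIELDS` are hypothetical. A production update would of course involve a real database and jurisdiction-specific rules.

```python
REQUIRED_FIELDS = ("number", "name", "status")  # hypothetical minimum schema


def receive(source):
    """Phase 1: receive the source data (here, any iterable of raw records)."""
    return list(source)


def validate_format(raw_records):
    """Phase 2: reject records that lack a required field."""
    for record in raw_records:
        missing = [f for f in REQUIRED_FIELDS if f not in record]
        if missing:
            raise ValueError(f"record {record!r} is missing {missing}")
    return raw_records


def convert(raw_records):
    """Phase 3: build structured documents and validate their content."""
    documents = []
    for record in raw_records:
        doc = {key: str(value).strip() for key, value in record.items()}
        if not doc["number"]:
            raise ValueError("empty trademark number")
        documents.append(doc)
    return documents


def apply_update(store, documents):
    """Phase 4: retrieve the current version of each document and update it."""
    for doc in documents:
        current = store.get(doc["number"], {})
        store[doc["number"]] = {**current, **doc}
    return store


def validate_result(store, documents):
    """Phase 5: confirm every updated document now holds its new values."""
    for doc in documents:
        stored = store[doc["number"]]
        if any(stored[key] != value for key, value in doc.items()):
            raise ValueError(f"update not applied for {doc['number']}")


def run_update(store, source):
    """Run the five phases in order against an in-memory store."""
    documents = convert(validate_format(receive(source)))
    apply_update(store, documents)
    validate_result(store, documents)
    return store
```

Because phase 4 merges the new values into the current version instead of replacing it, fields that the update does not mention are kept as they were.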
Figure: The main fields that need to be updated in a trademark document
How to ensure database quality
There are four phases when database quality needs to be guaranteed:
1. Development: Check the number of trademarks available with the PTO or other providers. Check that all fields are consistent, eg, a mark categorised as a device mark should have a logo.
2. Update: Keep statistics about the number of trademarks per update and check whether the new update is within a similar range.
3. Test: The data is reviewed by operations and sporadic tests are done with other sources.
4. Post-integration: Suspect cases are reported and directly dealt with. Root cause analyses are done for every case reported.
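Two of these checks lend themselves to automation: the field-consistency rule from the development phase and the volume comparison from the update phase. The Python sketch below illustrates both. The field names `type` and `logo`, the function names and the 50 per cent tolerance are assumptions chosen for the example, not values taken from Avantiq's process.

```python
from statistics import mean


def find_inconsistent_marks(documents):
    """Return the numbers of device marks that have no logo attached."""
    return [
        doc["number"]
        for doc in documents
        if doc.get("type") == "device" and not doc.get("logo")
    ]


def update_size_is_plausible(previous_counts, new_count, tolerance=0.5):
    """Check that an update is within a similar range to earlier ones.

    The update passes if its size deviates from the mean of the previous
    update sizes by no more than `tolerance` (an assumed threshold).
    """
    if not previous_counts:
        return True  # nothing to compare against yet
    baseline = mean(previous_counts)
    return abs(new_count - baseline) <= tolerance * baseline
```

An update that fails the plausibility check is not necessarily wrong, but it is a reasonable trigger for a manual review before the data goes live.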
The importance of quality
Here are a couple of fun issues that we detected and corrected manually:
Trademark name according to PTO: figurative
Manually corrected trademark name: Kawaii Zombies Cure ’em all
These examples show that data quality is extremely important: only correct data can be a good basis upon which to build a high-standard trademark service.
Larissa Best is director of strategic relations and marketing at Avantiq. She can be contacted at: email@example.com
This article was first published on 01 September 2012 in World IP Review