Artificial Intelligence - can it deliver improved patent search?
Stephen Adams, Magister Ltd. / Former PIUG Vice-Chair
www.magister.co.uk
Magister ® is a registered trade mark of Magister Ltd. in the United Kingdom
© Patent Information Users Group, Inc.
What is PIUG?
Non-profit organization for individuals with professional, scientific or technical interest in patent information.
Mission Statement
• To support, assist, improve and enhance the success of patent information professionals through leadership, education, communication, advocacy and networking.
• Promote and improve the retrieval, analysis and dissemination of patent information.
Patent Information Users Group, Inc. The International Society for Patent Information Professionals
Brief History – Key Events
[Timeline figure. Years marked: 1988, 1992, 1998, 1999, 2002, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2015, 2016. Events shown, in no particular order:]
• PIUG was born
• PIUG incorporated; formal elections to Board of Directors
• First Northeast regional Conference
• First Multi-day National (Annual) Conference
• First Biotechnology Conference
• PIUG Education & Training Task Force created
• First non-US Chair elected
• Start of PIUG Brian Stockdale Award
• Start of PIUG Stu Kaback Business Impact Award
• CEPIUG partnership
• WIPO, PDG partnerships
• Partnerships with AIIP, LES and AALL
• PIUG granted WIPO Observer Status; contributes to WIPO technical committees
A full list of PIUG Milestones is available
PIUG Members – by Affiliation Type (approximate ratios, 2019)
Corporate 38%
Suppliers 22%
Consulting 21%
Legal practice 9%
Other 10%
PIUG Members – by Country (approximate ratios, 2019)
324 US Members (~68%)
USA 68%
India 5%
United Kingdom 5%
China 3%
Canada 3%
Germany 2%
Other (AT, AU, CH, IL, JP, NL...) 14%
PIUG Wiki Forum; Learning & Networking Opportunities
• Users ask “Does anyone know…?” and engage in “what works, what doesn’t, tips on how to fix problems”
• Users directly contribute to Forums, etc.
• Patent Offices communicate proposals and activities
• Vendors present products and services information
• Employers post job openings to target audience
• Knowledge base of useful patent information
• “Members-Only” site
Context of my talk
© Magister Ltd. 2019
• Huge changes in the technology associated with searching electronic information, since my first Dialog ® search in 1981.
• Strong advocates and strong sceptics about the impact of AI on the future workplace.
• Most references to “AI and search” are in fact about data mining or pattern matching.
• My talk will focus on patentability searching, not on applications in patent analytics (e.g. state of the art reviews, patent statistics and strategic decisions).
Demand for AI solutions (I): cost-cutting at IPOs and industry
“Artificial Intelligence is the next big thing in the legal profession,… corporate clients have become increasingly cost-conscious … refusing to pay for the hours spent on research…”
Demand for AI solutions (II): shortage of expertise
“Our Story: ….outsourced search experts are unreliable…We have the skills to solve the problem..”
The demand for quality search has outstripped the supply of qualified people to do the searches.
Demand for AI solutions (III): “because it is there….”
Helmers L, Horn F, Biegler F, Oppermann T, Müller KR (2019) Automating the search for a patent’s prior art with a full text similarity search. PLOS ONE 14(3): e0212103. https://doi.org/10.1371/journal.pone.0212103
Artificial Intelligence; the next logical step at work and in life?
Pipeline crawlers: operating in workplaces that humans cannot enter.
Robot assembly lines: replacing human manual skills.
Automated health assessments: replacing intellectual skills.
The history of automation has often caused concern and fear.
“Consider what the invention could do to my poor subjects. It would assuredly bring to them ruin by depriving them of employment, thus making them beggars.” (Queen Elizabeth I)
William Lee (1563-1614) was refused a patent in 1589 for his frame knitting machine by Elizabeth I.
Coat of Arms of the Worshipful Company of Framework Knitters.
Is this just an old patent searcher’s reaction to ‘new technology’?
Anxious employees + enforced automation + sabot (French = wooden shoe) = sabotage (destruction of an employer’s property…by discontented workers)
Image: Public Domain, https://commons.wikimedia.org/w/index.php?curid=4150391
Current AI systems typically focus on the same objective within a defined environment.
Question: Can AI handle searches with different objectives in a global, dynamic environment?
What skills does a HUMAN patent information professional need?
[Diagram: four skill areas]
T = Technical field
I = Information science
B = Business awareness
P = Patent laws and procedures
In addition to skills, you need TIME!
[Figure: the T, I, B and P skills diagram repeated seven times]
Merely adding many years of work in the same job does not necessarily make you a better searcher.
Human learning is not exactly like machine learning….
[Figure: the T, I, B and P skills diagram, enclosed by a surrounding area marked E]
E = EXPERIENCE; surrounding and expanding upon all other skills.
“Experience is what you get, when you didn’t get what you wanted” (attrib. Randy Pausch/Dan Stanford).
The factor which separates an experienced searcher from a good one is their ability to learn when their existing skills break down.
Will AI deliver improved patent search?
• What are our overall ‘success criteria’?
  – AI search machines supporting or replacing human searchers?
  – AI search results matching a beginner, an average or a skilled human searcher?
  – AI search systems operating in all types of patent search, or only some types (e.g. statistical studies, patinformatics)?
• My success criteria for the purposes of this talk
  – Improvement = at least as good as a skilled searcher in patentability search, in all fields of technology.
The A-B-C of evaluating search systems
• Skilled users have a right to expect that any new search system (AI or not) can provide satisfactory responses to three key challenges:
  – Is the search AGNOSTIC?
  – Is the search BIASED?
  – Is the search CONSISTENT?
Challenge 1: ‘Agnostic’ search
• ‘Agnostic’ = ‘not holding a strong opinion one way or the other’ (neutral)
• The search process and the manner of presenting the results (relevance ranking) must not be influenced by earlier searches
  – each search is a unique, stand-alone ‘product’
• My challenge:
  – current AI-based systems appear to be set up to remember too much, and are not under the control of the human user.
  – useful for some searches, but not all search types.
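The agnostic requirement can be illustrated with a toy ranking function. This is only a sketch under stated assumptions, not a description of how any real search system works: the term-overlap score, the 0.5 history boost, and the documents `D1` and `D2` are all invented for the example.

```python
def score(query_terms, doc_terms):
    """Stateless relevance: number of query terms present in the document."""
    return len(set(query_terms) & set(doc_terms))

def rank_agnostic(query_terms, docs):
    """Rank using the current query only; sorted() is stable, so ties keep input order."""
    return sorted(docs, key=lambda d: -score(query_terms, docs[d]))

def rank_with_memory(query_terms, docs, history):
    """Same score, plus a (hypothetical) boost for terms from earlier searches."""
    def boosted(d):
        return score(query_terms, docs[d]) + 0.5 * score(history, docs[d])
    return sorted(docs, key=lambda d: -boosted(d))

# Two invented documents that match the query equally well.
docs = {
    "D1": ["knitting", "frame", "needle"],
    "D2": ["knitting", "frame", "polymer"],
}
query = ["knitting", "frame"]

agnostic_order = rank_agnostic(query, docs)
# An earlier, unrelated search for "polymer" changes the ranking of this one.
remembered_order = rank_with_memory(query, docs, history=["polymer"])
```

With an empty history the two functions agree; the ranking only diverges because of something the user did in a previous, unrelated search, which is exactly the influence the searcher cannot see or switch off.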
Selection and relevance depend upon context and objective of search
• The same patent reference may be
  – √ Completely relevant
    • Total anticipation of subject matter in a patentability search.
  – ≈ Partially relevant
    • At the fringes of the subject matter in a state-of-the-art review
    • A possible inventive step objection in invalidation or revocation proceedings.
  – x Not relevant at all
    • In force in the wrong jurisdiction for a freedom to operate search
    • Used in a different industry than our state of the art review target.
Search technology and filter bubbles.
“Searching online [on the internet] is more efficient and … puts researchers in touch with prevailing opinion, but this may … narrow the range of findings … ”
Evans, James A. “Electronic Publication and the Narrowing of Science and Scholarship.”
Science 321(5887) (18 Jul 2008) : 395-399
A filter bubble is a state of intellectual isolation … when a website algorithm selectively guesses what information a user would like to see … As a result, users become separated from information that disagrees with their viewpoints, effectively isolating them in their own cultural or ideological bubbles.
Wikipedia, “Filter Bubble” [Accessed 2018.09.05]
But surely that’s not a problem in science and technology search?
AI-based search systems must be ‘source neutral’ – every time.
‘But the Microsoft co-founder also believes that the filter-bubble problem will self-correct over time... Education is a counterbalance to filter bubbles, says Gates, since it exposes people “to a common base of knowledge.” ’
Why do patent searchers need ‘agnostic search’?
‘Trending’ searches
Irrelevant suggestions for ‘Tokyo PIFC’
Potentially misleading suggestions – one letter different!
Challenge 2: ‘Biased’ search
• Where did you (the human searcher) learn your search skills?
  – how many of your colleagues’ bad habits did you learn?
    • misunderstanding and bias can be perpetuated across many generations of trainees.
• AI-based search systems also need a ‘learning set’ before they can ‘understand text’
  – what happens if the learning set is incomplete or inaccurate?
Biased learning set → biased search
“Black box”
“Most search reports contain <5% NPL” → “Ignore NPL”
“Most USPTO citations are in English” → “Ignore Japanese references”
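The step from a skewed learning set to a skewed search can be sketched in a few lines. This is a deliberately naive illustration, not how any particular AI search product is trained: the frequency-threshold "learner", the 5% cut-off and the training counts (96 patent citations, 4 NPL citations) are assumptions invented for the example.

```python
from collections import Counter

def learn_source_filter(training_citations, min_share=0.05):
    """Keep only citation types that make up at least `min_share` of the
    training data. A naive rule, used here purely to expose the bias."""
    counts = Counter(training_citations)
    total = sum(counts.values())
    return {kind for kind, n in counts.items() if n / total >= min_share}

# Invented learning set: fewer than 5% of citations are non-patent literature (NPL).
training = ["patent"] * 96 + ["NPL"] * 4
allowed = learn_source_filter(training)

# A new search in which one of the two candidate documents is a journal article.
candidates = [("journal article X", "NPL"), ("patent Y", "patent")]
retrieved = [doc for doc, kind in candidates if kind in allowed]
```

The learner never sees a rule saying "ignore NPL"; it derives one from the proportions in its learning set, and the journal article is silently dropped from every later search.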
Can AI learn to be selective about the past?
• An expert human searcher accumulates a store of experience over time
  – AND knows when to deploy which part of that experience for a specific search.
• Can AI learn to deploy some of its experience and ignore other parts of it?
What information is in the learning set?
• Most information retrieval research so far has concentrated on TEXT
  – but patents are not all text; they use other ‘languages’ to describe inventions:
    • drawings, tables, figures, chemical/biological structures, circuit diagrams….
• Mihai Lupu, researcher in IR, Vienna
  – “Words [in patents] are pretty useless…but it’s all we have.”
  – words are usually imperfect (and sometimes deliberately imprecise) descriptions of the actual invention.
    • for the skilled patent searcher, words are NOT “all that we have”: other metadata are available in alternative sources of information.
What inputs do AI systems prefer?
• “A known relevant patent (text or number)”
  – but which part is ‘most relevant’ and why?
    • only a small aspect of the entire patent may be the feature we are searching for.
• “A known relevant paragraph or sub-section (text)”
  – what you gain by pinpointing the relevant feature, you may lose by not providing the wider context for the search
    • retrieve more noise?
Patents use many different ‘languages’ – can AI be taught to understand them?
Current search processes are little better – most ‘non-text’ still needs human interpretation.
• The Committee on WIPO Standards (CWS) has set up a 3D Task Force to consider recommending standards for 3D objects in IP applications (especially trade marks)
  – How will they be (i) represented, (ii) stored, (iii) examined, and (iv) searched?
• This is not a new problem e.g.:
  – Stephen Adams. Electronic non-text material in patent applications – some questions for patent offices, applicants and searchers. World Patent Information, 27(2), 2005, 99-103.
  – Stefanos Vrochidis et al. Concept-based patent image retrieval. World Patent Information, 34(4), 2012, 292-303.
Challenge 3: ‘Consistent’ search
• AI search engine developers seem to assume that
  – search localization is always a good thing,
  – new searches replace, not supplement, older ones,
  – there is limited need to reproduce past work or archive past search strategies.
• Patentability searches in industry may have differing requirements
  – all searchers using the same strategy get the same results,
  – need to conduct ‘top-up’ searches during the 18-month pre-publication period,
  – archiving past search strategies, as proof of best practice / competence (litigation).
Search localization is common in some general search engines.
google.com, google.co.uk, google.de, google.co.jp, google.com.br
“We use your country and location to deliver content relevant for your area..” https://www.google.com/search/howsearchworks/algorithms/
Impact of AI on top-up searches
[Diagram: the same strategy run at two points along a time axis t]
Boolean search engine
Strategy 1; time 1: answer set A
Strategy 1; time 2: answer set A and answer set B; set B – A = new answers only.
“Learning” search engine
Strategy 1; time 1: answer set A
Strategy 1; time 2: answer set A and answer set B; set B – A = some new answers ± some old answers.
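The difference between the two engines can be sketched with plain set arithmetic. This is a minimal illustration, not a model of any real search system: the document identifiers and the behaviour of the hypothetical learning engine (one earlier answer dropping out between runs) are assumptions made for the example.

```python
from __future__ import annotations

def top_up(earlier: set[str], later: set[str]) -> dict[str, set[str]]:
    """Compare the answer sets of one strategy run at two points in time."""
    return {
        "new": later - earlier,    # answers first seen in the later run
        "lost": earlier - later,   # answers the later run no longer returns
    }

# Boolean engine: the same strategy reproduces set A in full, and set B
# only adds documents published between the two runs.
boolean_a = {"EP1", "US2", "JP3"}
boolean_b = boolean_a | {"WO4"}

# Hypothetical "learning" engine: its ranking has shifted between runs, so
# one old answer has dropped out although it is still in the database.
learning_a = {"EP1", "US2", "JP3"}
learning_b = {"EP1", "JP3", "WO4", "CN5"}

boolean_diff = top_up(boolean_a, boolean_b)
learning_diff = top_up(learning_a, learning_b)
```

An empty `lost` set is what makes a top-up search auditable: everything reviewed at time 1 is still accounted for at time 2, so only the `new` documents need to be read.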
Archiving search strategies
• At present, both patent offices and industrial searchers can store meaningful strategies
  – patent office strategies help third parties to understand what prior art has been considered, and why.
  – industrial strategies may be produced in evidence in litigation.
• Strategies applied to non-Boolean systems are more difficult to archive, and may not give the same results if used later
  – search mechanism is hidden, not human-readable.
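One way to picture an archivable strategy is as a small, human-readable record with a fingerprint of the results it produced. The sketch below is an assumption-laden illustration, not an established archiving standard: the field names, the example strategy string, the database name `EXAMPLE-DB` and the run date are all placeholders.

```python
import hashlib
import json

def _fingerprint(results: list[str]) -> str:
    """SHA-256 of the sorted result list, so that result order does not matter."""
    canonical = "\n".join(sorted(results)).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def archive_record(strategy: str, database: str, run_date: str, results: list[str]) -> dict:
    """Build a readable record of a search, plus a fingerprint of its results."""
    return {
        "strategy": strategy,
        "database": database,
        "run_date": run_date,
        "result_count": len(results),
        "result_sha256": _fingerprint(results),
    }

def same_results(record: dict, rerun_results: list[str]) -> bool:
    """Check whether a later re-run reproduced the archived answer set."""
    return record["result_sha256"] == _fingerprint(rerun_results)

record = archive_record(
    strategy="(knit? OR stocking) AND frame",  # hypothetical Boolean strategy
    database="EXAMPLE-DB",                      # placeholder database name
    run_date="2019-01-01",                      # placeholder date
    results=["US2", "EP1", "JP3"],
)
archived_json = json.dumps(record, indent=2)
```

A record like this only carries meaning when the `strategy` field is something a human can read and re-run. For a system whose search mechanism is hidden, the fingerprint can show that results have changed but cannot explain why.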
How will the search professional react to AI search systems?
Professional search system evaluation:
Is the search AGNOSTIC?
Is the search BIASED?
Is the search CONSISTENT?
OR