conta-me histórias: temporal summarization and the Portuguese web archive
Ricardo Campos, Arian Pasquali, Vítor Mangaravite, Alípio Jorge, Adam Jatowt
TPDL 2018

about me
Arian Pasquali, Researcher
University of Porto and LIAAD - INESC TEC

challenges
- filter bubbles
- information overload
- credibility crisis
- fake news and post-truth era
- more than ever, web archives are crucial resources
- content-based machine learning models (e.g. fact-checkers)

importance of web archives
the importance of preserving the history of the web
~ 2% of the websites disappear every week *
~ 8% of URLs disappear every year **
* Frank McCown, Catherine C. Marshall, and Michael L. Nelson. 2009. Why web sites are lost. Commun. ACM 52, 11 (November 2009), 141-145.
** Legal Information Archive, The Chesapeake Project report. 2008

Challenge: Arquivo.pt Prizes
2018: Arquivo.pt announces a free text search API
- Use public API to create innovative ways to explore the web archive
https://sobre.arquivo.pt/en/collaborate/submissions-for-the-arquivo-pt-prizes-2018/
example query: "troika em Portugal"

idea
what if we could generate a timeline about any given topic automatically?
can we build a left-wing versus right-wing narrative timeline?
[timeline graphic: 2000, 2004, 2008, 2012, 2016]

timelines are a natural choice to explore long-term stories

problem definition
generate a timeline summarization with relevant dates and headlines, given a search query and a timeframe

steps
1. query news articles using arquivo.pt API
2. identify relevant time frames
3. identify relevant headlines

how to calculate headline relevance?
we apply principles of keyword extraction algorithms to calculate the relevance of headlines within a particular time frame

architecture
- given a search query and a timeframe
- apply principles of keyword extraction to compute relevant headlines from a particular time frame

architecture
1. news search (arquivo.pt API)
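
The news search step can be sketched as a small helper that builds a full-text search request. The endpoint `https://arquivo.pt/textsearch` and the parameter names `q`, `from`, `to`, `siteSearch` and `maxItems` are written from memory of Arquivo.pt's public API documentation and should be treated as assumptions to verify; the function name and the example domain are hypothetical. The sketch only builds the URL and does not call the network.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Arquivo.pt full-text search API (verify against current docs).
ARQUIVO_TEXTSEARCH = "https://arquivo.pt/textsearch"


def build_search_url(query, start, end, domains=(), max_items=50):
    """Build a search URL for `query` between two 14-digit timestamps (YYYYMMDDHHMMSS)."""
    params = {"q": query, "from": start, "to": end, "maxItems": max_items}
    if domains:
        # Restrict the search to selected news sites (assumed parameter name).
        params["siteSearch"] = ",".join(domains)
    return ARQUIVO_TEXTSEARCH + "?" + urlencode(params)


url = build_search_url(
    "troika em Portugal", "20080101000000", "20180101000000",
    domains=["publico.pt"],
)
print(url)
```

Keeping URL construction separate from the HTTP call makes the query parameters easy to inspect before any request is sent.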

architecture
2. compute term weights (YAKE)
the term weight S(w) combines the following features:
● TCase - Casing
  ○ Reflects the case aspect of a particular word
● TPos - Term position
  ○ Relevant terms tend to occur at the beginning of the document
● TFreq - Term frequency
  ○ Assumption that relevant words occur less
● TSent - Term frequency across different headlines
  ○ How often a word appears within different headlines
● TContent - Term relatedness to context
  ○ Number of different terms that occur to the left (or right)
  ○ The more different words occur around the candidate word, the more meaningless the word is likely to be
More information: Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C., & Jatowt, A. (2018). A Text Feature Based Automatic Keyword Extraction Method for Single Documents. Best Short Paper of ECIR’18 (http://bit.ly/YakeDemoECIR2018)
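
The formula that combines these features was an image on the slide and did not survive in the transcript. As a hedged sketch, the function below combines the five features in the form given in the cited YAKE paper, where a lower score marks a more relevant term. Mapping the slide's names (TPos, TFreq, TSent, TContent) onto the paper's position, normalized frequency, sentence and relatedness features is an assumption, and the feature values in the usage line are made up.

```python
def term_score(t_case, t_pos, t_freq, t_sent, t_content):
    """Combine the five term features into one weight S(w); lower means more relevant."""
    # Numerator: relatedness to context times position.
    numerator = t_content * t_pos
    # Denominator: casing plus frequency and sentence spread, both damped by relatedness.
    denominator = t_case + (t_freq / t_content) + (t_sent / t_content)
    return numerator / denominator


# Made-up feature values for a single term.
score = term_score(t_case=1.0, t_pos=2.0, t_freq=1.0, t_sent=1.0, t_content=2.0)
print(score)  # 2.0
```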

architecture
3. identify relevant time intervals (peak detection)
1. We aggregate the count of news articles by date
2. We then detect peaks using scipy’s relative extrema function
import numpy as np
from scipy.signal import argrelextrema
x = np.array([2, 1, 2, 3, 2, 0, 1, 0])
argrelextrema(x, np.greater, order=1)
>> (array([3, 6]),)
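
The aggregation and peak-detection steps can be sketched end to end without dependencies. The helper names, the monthly bucketing and the sample dates are assumptions for illustration. `local_maxima` is a plain-Python stand-in that, for interior points, selects the same indices as `argrelextrema(x, np.greater, order=1)`; it is not scipy's implementation.

```python
from collections import Counter
from datetime import date


def counts_by_month(dates):
    """Aggregate article dates into (month_label, count) pairs sorted by month."""
    counts = Counter(d.strftime("%Y-%m") for d in dates)
    return sorted(counts.items())


def local_maxima(values):
    """Indices whose value is strictly greater than both neighbours."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]


# Hypothetical publication dates of articles returned for one query.
article_dates = [
    date(2011, 3, 2),
    date(2011, 4, 6), date(2011, 4, 7), date(2011, 4, 20),
    date(2011, 5, 17),
    date(2011, 6, 5), date(2011, 6, 21),
    date(2011, 7, 1),
]

series = counts_by_month(article_dates)
peaks = [series[i][0] for i in local_maxima([count for _, count in series])]
print(peaks)  # ['2011-04', '2011-06']
```

Note that months without any article are simply absent from the series here; a fuller version would fill them with zero counts before looking for peaks.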

architecture
4. compute headline score (YAKE)
Considering headlines from each time interval, where:
● S(w) - Term weight
  ○ Considers weights from every term in the headline
● TF(kw) - Headline frequency
  ○ How many times this headline occurs
● S(kw) - Final headline score
More information: Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C., & Jatowt, A. (2018). A Text Feature Based Automatic Keyword Extraction Method for Single Documents. Best Short Paper of ECIR’18 (http://bit.ly/YakeDemoECIR2018)
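
The headline-score formula was an image on the slide and is missing from the transcript. The sketch below follows the keyword-score form given in the cited YAKE paper: the product of the term weights S(w) divided by TF(kw) times one plus their sum, with lower scores marking more relevant headlines. Treating a headline as the "keyword" kw follows the slide; the function name and the sample weights are made up.

```python
def headline_score(term_weights, headline_freq):
    """Final score S(kw) of a headline from its term weights S(w) and its frequency TF(kw)."""
    product = 1.0
    for weight in term_weights:
        product *= weight
    return product / (headline_freq * (1.0 + sum(term_weights)))


# Made-up term weights for a two-word headline seen twice in the interval.
score = headline_score([0.5, 0.2], headline_freq=2)
print(round(score, 4))  # 0.0294
```

A headline that occurs more often, or whose terms have lower weights, ends up with a lower and therefore better score.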

architecture
5. user interface (timeline)

more detailed information at
A Text Feature Based Automatic Keyword Extraction Method for Single Documents *
Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C., & Jatowt, A. (2018). 40th European Conference on Information Retrieval (ECIR'18). Grenoble, France. March 26 – 29. (Vol. 10772(2018), pp. 684 - 691).
YAKE! Collection-independent Automatic Keyword Extractor **
Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C., & Jatowt, A. (2018). 40th European Conference on Information Retrieval (ECIR'18). Grenoble, France. March 26 – 29. (Vol. 10772(2018), pp. 806 - 810).
* ECIR 2018 Best Short Paper Award
** Demo: http://bit.ly/YakeDemoECIR2018

http://contamehistorias.pt

troika em portugal ("troika in Portugal")
Narrative timeline of "troika em Portugal" during the last 10 years

research edition with different datasets
http://nlplab.inesctec.pt/contamehistorias-labs

facebook posts edition
http://nlplab.inesctec.pt/contamehistorias-facebook

Conta-me Histórias open source Python package
http://github.com/LIAAD/TemporalSummarizationFramework

Installing
requires Python 3

Using terminal interface
Accessing client help

Code with Arquivo.pt as data source

How to extend it
You can extend it to use your own collection as a data source.
Extend the BaseDataSource class and implement the getResult method to return a list of ResultHeadLine objects. For each document you need a title, a timestamp, a source and a url.
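
The extension pattern can be sketched as follows. To keep the example runnable on its own, `BaseDataSource` and `ResultHeadLine` are defined here as stand-ins: their constructor and method signatures are assumptions, not the package's actual API, and `InMemoryDataSource` with its sample documents is hypothetical. Only the four fields named above (title, timestamp, source, url) come from the slide.

```python
from datetime import datetime


class ResultHeadLine:
    """Stand-in for the package's result object (assumed fields)."""

    def __init__(self, title, timestamp, source, url):
        self.title = title
        self.timestamp = timestamp
        self.source = source
        self.url = url


class BaseDataSource:
    """Stand-in for the package's base class (assumed interface)."""

    def getResult(self, query, **kwargs):
        raise NotImplementedError


class InMemoryDataSource(BaseDataSource):
    """Hypothetical data source backed by a list of dicts."""

    def __init__(self, documents):
        self.documents = documents

    def getResult(self, query, **kwargs):
        # Keep documents whose title mentions the query (case-insensitive).
        return [
            ResultHeadLine(d["title"], d["timestamp"], d["source"], d["url"])
            for d in self.documents
            if query.lower() in d["title"].lower()
        ]


docs = [
    {"title": "Troika arrives in Lisbon", "timestamp": datetime(2011, 4, 12),
     "source": "example.org", "url": "http://example.org/a"},
    {"title": "Football results", "timestamp": datetime(2011, 4, 13),
     "source": "example.org", "url": "http://example.org/b"},
]
results = InMemoryDataSource(docs).getResult("troika")
print([r.title for r in results])  # ['Troika arrives in Lisbon']
```

With the real package, the subclass would inherit from the library's own base class and the returned objects would be passed on to the summarization step in place of Arquivo.pt results.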

Extension example with custom dataset *
* Signal’s NewsIR’16 dataset: http://research.signalmedia.co/newsir16/signal-dataset.html

Output

next steps
● Support different web archives as data sources
● Different timelines for different kinds of sources (left-wing, right-wing, tabloids, etc.)
● Extended formal evaluation