Building matrices and normalization. In order to normalize co-occurrences you will need first to...

Post on 03-Jan-2016

Building matrices and normalization

To normalize co-occurrences, you first need to build a matrix with the units (words, cited authors, etc.) in the columns and the document numbers in the rows. BibExcel fills the matrix with numbers, after which you can calculate Salton’s index or the Jaccard index.
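For orientation, the two indices are commonly defined as follows. The symbols are introduced here for illustration and do not appear in BibExcel: c_i is the number of documents containing unit i, and c_ij is the number of documents containing both unit i and unit j. This assumes a matrix that only records presence or absence of a unit in a document.

```latex
% Salton's cosine and the Jaccard index for two units i and j
\[
  S_{ij} = \frac{c_{ij}}{\sqrt{c_i \, c_j}},
  \qquad
  J_{ij} = \frac{c_{ij}}{c_i + c_j - c_{ij}}
\]
```

Both values lie between 0 (the units never occur together) and 1 (the units always occur together).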

Make a co-word analysis based on the ID-field. Running Edit out-files/Convert Upper Lower Case/Good for reference strings on the out-file produces a low-file, which has a nicer look than the out-file.

Calculate frequencies on the low-file; the resulting cit-file looks like this:

Select the most frequent units, down to a frequency of 20, sort them in Excel, and then paste them into The List. Then select the low-file containing the ID-words and run Analyze/Docs and units matrix/Make docnr+units matrix without zero row sum.
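The selection step can be pictured with a small Python sketch. It is not BibExcel's own code: the `docs` dictionary, the function name `frequent_units`, and the threshold of 2 (instead of 20) are made up for illustration, and it assumes that a unit's frequency is the number of documents it occurs in.

```python
from collections import Counter

# Hypothetical input: document number -> keywords from the ID-field.
docs = {
    1: ["Citation analysis", "Bibliometrics"],
    2: ["Bibliometrics", "Co-word analysis"],
    3: ["Peer review"],
    4: ["Citation analysis", "Bibliometrics", "Peer review"],
}


def frequent_units(docs, min_freq):
    """Return (unit, frequency) pairs with frequency >= min_freq, most frequent first."""
    # Count each unit once per document (assumed meaning of "frequency").
    counts = Counter(unit for units in docs.values() for unit in set(units))
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(unit, n) for unit, n in ranked if n >= min_freq]


selected = frequent_units(docs, min_freq=2)
print(selected)
```

The units returned here play the role of the entries pasted into The List.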

The ma5-file now contains the matrix!
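As a rough picture of what such a matrix holds, here is a minimal Python sketch. It is an illustration under assumptions, not the ma5 file format: the data, the names `docs`, `selected_units` and `build_matrix`, and the choice of 0/1 cell values are all hypothetical.

```python
# Hypothetical input: document number -> keywords from the ID-field.
docs = {
    1: ["Citation analysis", "Bibliometrics"],
    2: ["Bibliometrics", "Co-word analysis"],
    3: ["Peer review"],
    4: ["Citation analysis", "Bibliometrics", "Peer review"],
}
# Hypothetical selection of units (the role played by The List).
selected_units = ["Bibliometrics", "Citation analysis", "Co-word analysis"]


def build_matrix(docs, units):
    """Return (doc_numbers, rows): one 0/1 row per document, one column per unit."""
    doc_numbers, rows = [], []
    for doc_nr in sorted(docs):
        present = set(docs[doc_nr])
        row = [1 if unit in present else 0 for unit in units]
        if sum(row) > 0:  # "without zero row sum": skip documents with no selected unit
            doc_numbers.append(doc_nr)
            rows.append(row)
    return doc_numbers, rows


doc_numbers, matrix = build_matrix(docs, selected_units)
for doc_nr, row in zip(doc_numbers, matrix):
    print(doc_nr, row)
```

Document 3 contains none of the selected units, so its row is left out of the matrix.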

To calculate Salton’s index select the ma5-file and run Analyze/Docs and units/Calculate Salton cosine from a ma5-file

Answer Yes (Ja) to this question:

Answer No (Nej) to this question:

…and the result is in the sal-file, which contains the Salton index values multiplied by 1000 (good for some applications).
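To see what kind of numbers end up in such a file, the following Python sketch computes Salton's cosine between two columns of a small made-up matrix and scales it by 1000. The matrix, the function name `salton_x1000`, and the rounding to the nearest integer are assumptions for illustration; how BibExcel itself rounds is not stated here.

```python
import math

# Hypothetical docs-by-units matrix: rows are documents, columns are units.
units = ["Bibliometrics", "Citation analysis", "Co-word analysis"]
matrix = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
]


def salton_x1000(matrix, i, j):
    """Salton's cosine between columns i and j, multiplied by 1000 and rounded."""
    col_i = [row[i] for row in matrix]
    col_j = [row[j] for row in matrix]
    dot = sum(a * b for a, b in zip(col_i, col_j))
    norm = math.sqrt(sum(a * a for a in col_i) * sum(b * b for b in col_j))
    # Rounding is an assumption; an empty column yields 0 to avoid division by zero.
    return round(1000 * dot / norm) if norm else 0


for i in range(len(units)):
    for j in range(i + 1, len(units)):
        print(units[i], "|", units[j], salton_x1000(matrix, i, j))
```

Scaling by 1000 turns the fractional index into whole numbers, which some programs handle more easily.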

Instead of Salton’s index, you may choose the Jaccard or the Vladutz & Cook normalization and apply it to the ma5-file.
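For comparison, the Jaccard index can be sketched in Python on the same kind of matrix. The data and the function name `jaccard` are hypothetical, and the sketch uses the usual definition (documents containing both units divided by documents containing at least one of them), not BibExcel's own code. The Vladutz & Cook measure is not covered here.

```python
# Hypothetical docs-by-units matrix: rows are documents, columns are units.
matrix = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
]


def jaccard(matrix, i, j):
    """Jaccard index between columns i and j of a 0/1 matrix."""
    both = sum(1 for row in matrix if row[i] and row[j])
    either = sum(1 for row in matrix if row[i] or row[j])
    return both / either if either else 0.0


print(jaccard(matrix, 0, 1))
print(jaccard(matrix, 0, 2))
```

On the same data the Jaccard value is never higher than Salton's cosine, so the choice of index affects how strongly pairs of units appear to be linked.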