Explanations
mentions of organizations and their actions, especially in a critical context
numbers and punctuation
Negative Logits
TypedDataSet    -0.52
creș            -0.47
și              -0.46
enggak          -0.45
theaters        -0.44
ști             -0.44
ști             -0.42
theater         -0.42
Și              -0.41
și              -0.40
Positive Logits
COLOUR          0.75
colourful       0.75
favourite       0.75
Behaviour       0.74
favourite       0.73
Neighbour       0.73
Colour          0.71
randomised      0.71
tumour          0.70
colourful       0.69
Activations Density 0.154%
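
The density figure above is not defined on this page. A common reading is the fraction of token positions on which the feature's activation is nonzero. The sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 15 of which fire.
sample = [0.0] * 9985 + [1.2] * 15
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the readout above
```

Under this reading, a density of 0.154% would mean the feature fires on roughly 15 of every 10,000 token positions.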