INDEX
Explanations
the word "con" and its variations, suggesting it tracks contexts of consensus or contention in discussions
Negative Logits
myſelf        -0.90
ſelves        -0.79
ſelf          -0.77
themſelves    -0.74
Eſ            -0.74
itſelf        -0.70
poffible      -0.69
leaſt         -0.66
pleaſure      -0.66
neſs          -0.64
Positive Logits
con            2.87
com            0.82
CON            0.79
coi            0.70
Con            0.68
cons           0.68
с              0.68
autorytatywna  0.67
cun            0.65
----</         0.64
Activations Density: 0.072%