Explanations
the word "interested", and perhaps words related to it
interested
Negative Logits
Token    Logit
<bos>    -1.21
#        -0.71
д        -0.63
n        -0.63
o        -0.61
there    -0.57
I        -0.57
l        -0.56
cade     -0.56
we       -0.55
Positive Logits
Token         Logit
Efq           0.94
-------       0.91
Jefus         0.90
enfans        0.88
bibfield      0.87
Chrift        0.86
Monfieur      0.85
Majefty       0.85
ordinaires    0.84
religieux     0.83
Activations Density 1.551%