Explanations
specific occurrences of the word "this"
Negative Logits
Token      Logit
ickets     -0.82
letes      -0.81
acers      -0.81
apses      -0.76
stars      -0.74
onis       -0.72
inx        -0.70
acer       -0.69
planes     -0.69
winning    -0.69
Positive Logits
Token           Logit
regard          1.29
vein            1.16
context         1.07
manner          1.03
case            1.00
vicinity        0.99
circumstance    0.99
particular      0.96
situation       0.95
guise           0.93
Activations Density: 0.046%
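
A density figure like the one above is usually read as the fraction of sampled token positions on which the feature fires. The sketch below shows that calculation under that assumption. The function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 5 of which activate the feature.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.050%
```

On this reading, a density of 0.046% would correspond to the feature firing on roughly 46 of every 100,000 sampled tokens.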