INDEX
Explanations
instances of the word "almost" followed by a number
references to the word "almost" in various contexts
Negative Logits
oran        -0.79
agate       -0.79
oris        -0.73
RTX         -0.71
Ds          -0.70
eria        -0.66
alam        -0.65
oğ          -0.65
Ey          -0.64
erion       -0.63
Positive Logits
etheless    0.79
certainly   0.74
exclusively 0.73
identical   0.71
stress      0.69
mundane     0.67
zero        0.66
ident       0.63
electr      0.63
untarily    0.63
Activations Density: 0.032%