Explanations
words ending in 's'
the word "is" in various contexts
Negative Logits
izons      -0.62
irtual     -0.62
luaj       -0.60
Juda       -0.60
fee        -0.59
inav       -0.59
erning     -0.58
orthy      -0.55
farious    -0.55
icz        -0.55
Positive Logits
why              0.92
gonna            0.82
happening        0.75
preferable       0.71
okay             0.71
alright          0.71
understandable   0.71
impossible       0.70
worth            0.69
ername           0.69
Activations Density 0.039%