Explanations
food-related terms or dishes
Non-English language fragments and code snippets intermixed
complex nouns and abbreviations
Negative Logits
(no token shown)   -0.62
im                 -0.52
j                  -0.50
est                -0.49
for                -0.49
Re                 -0.47
ון                 -0.47
A                  -0.46
Javadoc            -0.46
g                  -0.46
Positive Logits
ſtate      0.96
myſelf     0.92
purpoſe    0.89
Jefus      0.87
ftate      0.86
ſmall      0.85
ſelf       0.83
houſe      0.83
ſever      0.83
perſon     0.82
Activations Density 0.031%