Explanations
words describing a lack of clarity or vagueness
Negative Logits
isa         -0.17
Goth        -0.15
avn         -0.15
lew         -0.15
antis       -0.15
æµģéĩı      -0.15
avez        -0.14
anean       -0.14
èĤ¥         -0.14
-toggler    -0.14
Positive Logits
cape        0.17
(           0.16
atem        0.16
Hin         0.15
Ridley      0.14
óst         0.14
wand        0.14
Birch       0.14
insula      0.13
Dense       0.13
Activations Density: 0.007%