INDEX
Explanations
words related to importance or essentiality
references to essential or crucial concepts
Negative Logits
AUT          -0.79
©¶æ          -0.73
annis        -0.72
renheit      -0.69
sein         -0.69
Ń·           -0.68
hang         -0.67
ften         -0.66
ATT          -0.66
Sett         -0.65
Positive Logits
organs       0.90
vital        0.90
destro       0.85
ogical       0.80
istically    0.80
rament       0.79
itational    0.77
arial        0.75
emic         0.74
istical      0.74
Activations Density: 0.006%