Negative Logits
myſelf  -0.86
دانشنامهٔ  -0.85
ValueStyle  -0.85
fubject  -0.79
itſelf  -0.79
ainfi  -0.75
ſtate  -0.75
tvguidetime  -0.75
ſche  -0.74
himſelf  -0.73
Positive Logits
भ  0.51
long  0.51
по  0.50
vő  0.46
ge  0.46
Booth  0.45
chas  0.45
esp  0.44
biến  0.44
hand  0.44
Activations Density: 0.527%