Explanations
references to Sky, particularly in relation to sports and news
Negative Logits
och        -0.16
Pale       -0.15
uv         -0.15
ucci       -0.15
azo        -0.14
Justice    -0.14
icht       -0.14
powerful   -0.14
staining   -0.14
isto       -0.13
Positive Logits
/Instruction   0.17
lec            0.17
rim            0.17
linik          0.16
custody        0.16
plusplus       0.16
LENG           0.15
thouse         0.15
-REAL          0.14
íĨłíĨł         0.14
Activations Density: 0.011%