INDEX
Explanations
references to specific or proper nouns
Negative Logits
İĭ        -0.72
kie       -0.68
plateau   -0.66
ality     -0.62
sack      -0.60
aido      -0.59
hust      -0.59
ists      -0.59
lawn      -0.57
lance     -0.57
Positive Logits
LOG         0.80
EMOTE       0.80
76561       0.77
SAY         0.71
Computer    0.71
GAME        0.71
Transcript  0.70
Console     0.70
Ep          0.70
liquid      0.69
Activations Density 0.023%