INDEX
Explanations
unusual words or phrases within the text
Negative Logits
hyde      -0.80
street    -0.79
zig       -0.76
wedge     -0.76
Boe       -0.73
Xer       -0.71
tics      -0.71
terson    -0.70
jay       -0.69
Wy        -0.69
Positive Logits
atch          1.09
ailability    0.99
POSE          0.94
Mahjong       0.92
ortunately    0.89
essim         0.87
Applic        0.86
itely         0.84
ATCH          0.84
unbeliev      0.83
Activations Density: 0.618%