INDEX
Explanations
specific phrases that indicate exceptions, comparisons, or critical conditions
Negative Logits
zig           -0.15
irc           -0.15
Assertions    -0.15
iges          -0.14
yc            -0.14
ửa            -0.14
uento         -0.14
ause          -0.14
ła            -0.14
laughter      -0.13
Positive Logits
physical      0.25
Physical      0.21
physical      0.21
aspect        0.19
Physical      0.17
regarding     0.17
od            0.17
fís           0.16
phis          0.16
hardware      0.16
Activations Density 0.150%