Explanations
patterns in repetitions of symbols
special characters or repeated symbols in the text
Negative Logits
Token             Logit
Writ              -0.64
Deb               -0.59
veget             -0.57
AUD               -0.56
sponsoring        -0.55
orchestr          -0.55
supportive        -0.55
intent            -0.54
lav               -0.54
interpret         -0.54
Positive Logits
Token             Logit
istg              0.77
..............    0.71
]);               0.69
train             0.65
]).               0.63
Train             0.62
................  0.61
..                0.61
......            0.61
Frieza            0.61
Activations Density 0.202%