INDEX
Explanations
letters and symbols occurring in patterns
instances of unique or unexpected punctuation and special characters
Negative Logits
hement        -0.72
isconsin      -0.63
abwe          -0.63
akings        -0.62
footing       -0.61
anes          -0.59
acan          -0.58
emark         -0.57
elimination   -0.57
isons         -0.56
Positive Logits
--------------------------------------------------------  0.92
Reward        0.89
à¼            0.89
Synopsis      0.88
Plug          0.85
Relations     0.82
âĶĢâĶĢâĶĢâĶĢ  0.82
ãģı           0.81
ÙĦ            0.80
Norn          0.79
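Several tokens in the logit lists look like mojibake. One plausible reading, assuming the underlying model uses a GPT-2 style byte-level BPE vocabulary (the source does not name the tokenizer), is that each displayed character stands for one raw byte. The sketch below rebuilds that byte-to-character table and inverts it; `byte_to_char_table` and `decode_token` are hypothetical helper names, not part of any dashboard API.

```python
def byte_to_char_table():
    """Byte -> printable character map used by GPT-2 style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    # Every other byte is shifted to a code point at 256 and above.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse map: displayed character -> raw byte (assumed tokenizer convention).
CHAR_TO_BYTE = {c: b for b, c in byte_to_char_table().items()}


def decode_token(token):
    """Turn a displayed vocabulary string back into readable text.

    Incomplete UTF-8 sequences become U+FFFD instead of raising.
    """
    raw = bytes(CHAR_TO_BYTE[c] for c in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("âĶĢâĶĢâĶĢâĶĢ"))  # ──── (four box-drawing dashes)
print(decode_token("ãģı"))            # く
print(decode_token("ÙĦ"))             # ل
```

Under this assumption, `à¼` is only the first two bytes of a three-byte UTF-8 sequence, so it has no standalone rendering and decodes to a replacement character.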
Activations Density 0.269%