Explanations
references to the word "Char" followed by a digit
mentions of a specific brand or character
Negative Logits
MW          -0.87
benchmark   -0.70
ively       -0.70
udo         -0.67
prem        -0.67
OIL         -0.66
uckland     -0.65
Bench       -0.65
wave        -0.65
peak        -0.64
Positive Logits
Char        3.88
Char        2.79
char        1.66
Charm       1.62
char        1.61
CHAR        1.52
Chau        1.50
Charity     1.39
Chr         1.28
Cha         1.22
Activations Density: 0.011%