INDEX
Explanations
proper names and titles
prominent names or titles related to individuals
Negative Logits
ModLoader        -0.66
hindsight        -0.62
clubhouse        -0.61
LEASE            -0.57
ãĤ¨ãĥ«           -0.52
symbolism        -0.52
interchangeable  -0.51
Wolves           -0.51
equivalents      -0.51
undecided        -0.51
Positive Logits
inski            0.87
ovich            0.83
mann             0.82
enberg           0.78
stad             0.78
auer             0.77
ich              0.76
ensen            0.76
endor            0.75
(@               0.73
Activations Density: 0.765%