Explanations
references to online communities and their rules
Negative Logits
CanadaChoose      -0.52
évaluateur        -0.43
tvguidetime       -0.43
SharedCtor        -0.41
原始内容存档于    -0.41
Administrativna   -0.38
Hozzáférés        -0.38
aarrggbb          -0.37
Errorf            -0.36
😂                -0.36
Positive Logits
anon      0.68
Anon      0.61
Trips     0.60
Anon      0.59
Kek       0.58
kek       0.58
faggot    0.58
Trips     0.57
>>        0.57
>>        0.57
Activations Density 0.277%