Explanations
content related to relationships and familial connections
Negative Logits
Logit  Token
-1.00  Obvious
-0.96  ovviamente
-0.91  seems
-0.89  obvious
-0.88  obviously
-0.88  wydaje
-0.88  natuurlijk
-0.87  reminder
-0.86  reportedly
-0.86  кажется
Positive Logits
Logit  Token
1.26  indeed
1.01  indeed
1.01  efectivamente
0.87  inderdaad
0.77  وأن
0.75  للمعارف
0.74  tatsächlich
0.74  actually
0.72  Indeed
0.67  effectivement
Activations Density: 1.297%