INDEX
Explanations
concepts related to rationality and logical reasoning
Negative Logits
nahilalakip    -0.76
anonymity      -0.64
borgen         -0.61
sunter         -0.58
kasarigan      -0.57
jspb           -0.56
Pautan         -0.56
eckt           -0.55
Complaints     -0.55
Millard        -0.54
Positive Logits
racional       1.06
rational       0.99
Vernunft       0.92
reasoning      0.92
rationality    0.89
rationally     0.86
logical        0.81
Rational       0.81
sagesse        0.80
logic          0.80
Activations Density 0.299%
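
A minimal sketch of how a density figure like the one above could be computed, assuming it means the share of sampled tokens on which the feature's activation is above zero. The page does not define the metric, so that reading, the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activation values strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. This is an illustrative definition, not one
    confirmed by the source page.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 1000 tokens.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.299% would correspond to the feature firing on roughly 3 of every 1000 sampled tokens.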