INDEX
Explanations
phrases related to boolean values (true or false)
statements about truth and falsehood
Negative Logits
approach        -0.73
intern          -0.72
licensing       -0.72
fees            -0.71
disposal        -0.70
cost            -0.70
review          -0.67
overhaul        -0.67
supervision     -0.66
announcement    -0.66
Positive Logits
true            3.57
false           3.55
False           2.02
True            1.91
null            1.90
truth           1.55
nil             1.18
ALSE            1.16
reality         1.14
successful      1.11
Activations Density 0.015%