INDEX
Explanations
states, actions, and sensory details
Negative Logits
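일반적으로    1.00    (Korean: "generally")
ANY    0.88
ANYTHING    0.88
कोणत्याही    0.88    (Marathi: "any")
généralement    0.86    (French: "generally")
任何    0.85    (Chinese: "any")
任何人    0.83    (Chinese: "anyone")
ANY    0.83
যেকোনো    0.83    (Bengali: "any")
anything    0.82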
Positive Logits
alike    0.76
upon    0.72
Upon    0.72
Upon    0.72
beneath    0.70
beside    0.69
ablaze    0.67
Begegn    0.66    (German word fragment, as in "Begegnung", "encounter")
にて    0.64    (Japanese particle: "at", "in")
salute    0.64
Activations Density 0.316%