News
11mon
Tech Xplore on MSNDiscrete-time rewards efficiently guide the extraction of continuous-time optimal control policy from system dataThe expression of rewards largely represents the perception of the system and defines the behavioral ... learning algorithms ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results