[BUG] Incompatible with gymnasium's NormalizeObservation wrapper due to missing "num_envs", "is_vector_env" and "single_observation_space" attributes #258

@CloudyDory

Describe the bug

envpool's vectorized environments appear to be incompatible with gymnasium's NormalizeObservation wrapper, because the environments returned by envpool lack the "num_envs", "is_vector_env" and "single_observation_space" attributes.

Here is the code for gymnasium's normalization wrapper:

class NormalizeObservation(gym.Wrapper):
    """This wrapper will normalize observations s.t. each coordinate is centered with unit variance.

    Note:
        The normalization depends on past trajectories and observations will not be normalized correctly if the wrapper was
        newly instantiated or the policy was changed recently.
    """

    def __init__(self, env: gym.Env, epsilon: float = 1e-8):
        """This wrapper will normalize observations s.t. each coordinate is centered with unit variance.

        Args:
            env (Env): The environment to apply the wrapper
            epsilon: A stability parameter that is used when scaling the observations.
        """
        super().__init__(env)
        self.num_envs = getattr(env, "num_envs", 1)
        self.is_vector_env = getattr(env, "is_vector_env", False)
        if self.is_vector_env:
            self.obs_rms = RunningMeanStd(shape=self.single_observation_space.shape)
        else:
            self.obs_rms = RunningMeanStd(shape=self.observation_space.shape)
        self.epsilon = epsilon

    def step(self, action):
        """Steps through the environment and normalizes the observation."""
        obs, rews, terminateds, truncateds, infos = self.env.step(action)
        if self.is_vector_env:
            obs = self.normalize(obs)
        else:
            obs = self.normalize(np.array([obs]))[0]
        return obs, rews, terminateds, truncateds, infos

    def reset(self, **kwargs):
        """Resets the environment and normalizes the observation."""
        obs, info = self.env.reset(**kwargs)

        if self.is_vector_env:
            return self.normalize(obs), info
        else:
            return self.normalize(np.array([obs]))[0], info

    def normalize(self, obs):
        """Normalises the observation using the running mean and variance of the observations."""
        self.obs_rms.update(obs)
        return (obs - self.obs_rms.mean) / np.sqrt(self.obs_rms.var + self.epsilon)

The __init__() method needs the above three attributes to work correctly, but the environment object returned by envpool does not have them. As a result, the getattr() calls in __init__() fall back to their defaults (num_envs = 1, is_vector_env = False), which are wrong for a batched environment.
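
The fallback can be reproduced without either library. The sketch below uses a hypothetical stand-in class, FakeBatchedEnv, that represents a batch of 8 environments but, like the envpool object described above, exposes neither "num_envs" nor "is_vector_env". The two getattr() lookups mirror the ones in the wrapper's __init__(); everything else is an assumption made for illustration.

```python
class FakeBatchedEnv:
    """Hypothetical stand-in for a batched env that lacks the vector-env attributes."""

    def __init__(self, batch_size):
        # Stored under a different name on purpose: the wrapper never looks here.
        self.batch_size = batch_size


def detect_vector_settings(env):
    # Same two lookups as in NormalizeObservation.__init__ quoted above.
    num_envs = getattr(env, "num_envs", 1)
    is_vector_env = getattr(env, "is_vector_env", False)
    return num_envs, is_vector_env


env = FakeBatchedEnv(batch_size=8)
detected = detect_vector_settings(env)
print(detected)  # (1, False): the defaults win, with no error or warning
```

Because getattr() with a default never raises, the mismatch stays silent and the wrapper takes the single-environment code path.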

To Reproduce

import gymnasium as gym
import envpool
train_envs = envpool.make('Breakout-v5',  env_type='gymnasium',  num_envs=8)
train_envs = gym.wrappers.NormalizeObservation(train_envs)
print(train_envs.num_envs)
print(train_envs.is_vector_env)

Expected behavior

train_envs.num_envs = 8
train_envs.is_vector_env = True

Actual behavior

train_envs.num_envs = 1
train_envs.is_vector_env = False

Screenshots

No screenshots.

System info

Describe the characteristic of your environment:

  • Describe how the library was installed (pip, source, ...)
  • Python version
  • Versions of any other relevant libraries
import envpool, numpy, sys
print(envpool.__version__, numpy.__version__, sys.version, sys.platform)

0.8.2 1.23.5 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0] linux

Additional context

No context.

Reason and Possible fixes

If you know or suspect the reason for this bug, paste the code lines and suggest modifications.
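
One possible workaround, sketched here as an assumption rather than a confirmed fix, is to set the three attributes on the environment before wrapping it. The helper name mark_as_vector_env and both stub classes are hypothetical. Whether envpool's environment objects accept attribute assignment, and whether their observation_space already describes a single environment rather than the whole batch, would need to be checked against envpool itself.

```python
class StubSpace:
    """Hypothetical stand-in for an observation space that has a shape."""

    def __init__(self, shape):
        self.shape = shape


class StubBatchedEnv:
    """Hypothetical stand-in for the object returned by envpool.make(...)."""

    def __init__(self, obs_shape):
        # Assumption: observation_space describes ONE environment's observation.
        self.observation_space = StubSpace(obs_shape)


def mark_as_vector_env(env, num_envs):
    """Attach the attributes that NormalizeObservation looks up in __init__."""
    env.num_envs = num_envs
    env.is_vector_env = True
    # Assumption: the per-environment space can be reused as-is.
    env.single_observation_space = env.observation_space
    return env


env = mark_as_vector_env(StubBatchedEnv(obs_shape=(4, 84, 84)), num_envs=8)
print(getattr(env, "num_envs", 1), getattr(env, "is_vector_env", False))
print(env.single_observation_space.shape)
```

With the real libraries, the call would sit between envpool.make(...) and gym.wrappers.NormalizeObservation(...) in the reproduction script above.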

Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have read the documentation (required)
  • I have provided a minimal working example to reproduce the bug (required)

Labels: question (Further information is requested)
