Skip to content

Conversation

@mlangsdorf
Copy link
Contributor

@mlangsdorf mlangsdorf commented Jul 27, 2017

group_manifests: create plugin for group manifests

Add a group_manifests plugin. For now, it fails if asked to create
a group of manifests.

If asked to create a single manifest, it pulls the x86_64 image manifest
from a repository passed as an argument. It then re-uploads the image
manifest to a new location for each image in the tag_conf. It returns the
image manifest.

Signed-off-by: Mark Langsdorf mlangsdo@redhat.com

annotations = BUILD_ANNOTATIONS
for worker in worker_annotations:
annotations['worker-builds'][worker] = worker_annotations[worker]
else:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W291 trailing whitespace

annotations['worker-builds'][worker] = worker_annotations[worker]
else:
annotations['worker-builds']['x86_64'] = X86_ANNOTATIONS

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W293 blank line contains whitespace

},
}]
tasker, workflow = mock_environment(tmpdir, docker_registry=DOCKER0_REGISTRY,
primary_images=test_images,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W291 trailing whitespace

@mlangsdorf
Copy link
Contributor Author

This code is most likely wrong, but it's close enough for review.

I suspect that this code:

            digest = worker_digests[0]['digest']
            tag = worker_digests[0]['tag']
            repo = tag.split(':')[0]
            for image in self.workflow.tag_conf.images:
                image_tag = image.to_str(registry=False)

                url = '{}/v2/{}/manifests/{}'.format(registry, repo, digest)

                response = requests.head(url, verify=not insecure, auth=auth)
                response = requests.get(url, verify=not insecure, auth=auth)

should use the image name in place of the digest, but I'm not 100% sure.

I can't get this code to fail if there's an amd64 platform in the annotations

        for platform in all_annotations['worker-builds']:
            if self.goarch.get(platform, platform) == 'amd64':

I suspect that is not the desired behavior, but I don't understand what goarch is for.

Copy link
Contributor

@lcarva lcarva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be going in the right direction.

import requests.auth
import json

try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use six here: from six.moves.urllib.parse import urlparse
It should work for both python 2 and python 3.

Note: We recently added this dependency to atomic-reactor. There are parts of the code that need to be updated to avoid the ImportError try-except approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

grouped_manifests = []

build_result = self.workflow.buildstep_result.get(OrchestrateBuildPlugin.key)
all_annotations = build_result.annotations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also available at: self.workflow.build_result.annotations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

key = 'group_manifests'
is_allowed_to_fail = False

def __init__(self, tasker, workflow, registries, group=True, goarch={}):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using mutable objects as default values. Instead, set it to None and create a new instance in method body.

..., goarch=None):
  goarch = goarch or {}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


url = '{}/v2/{}/manifests/{}'.format(registry, repo, digest)

response = requests.head(url, verify=not insecure, auth=auth)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why HEAD?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HEAD should be sufficient for checking that the manifest exists, but it also should have response.raise_for_status() to throw exception on HTTP 404.

Anyway, since get is performed later on, its not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

url = '{}/v2/{}/manifests/{}'.format(registry, repo, digest)

response = requests.head(url, verify=not insecure, auth=auth)
response = requests.get(url, verify=not insecure, auth=auth)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do this for each iteration in this for-loop? Parameters don't seem to change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


:param tasker: DockerTasker instance
:param workflow: DockerBuildWorkflow instance
:param timeout: int, maximum number of seconds to wait
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outdated params doc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

build_result = self.workflow.buildstep_result.get(OrchestrateBuildPlugin.key)
all_annotations = build_result.annotations
for platform in all_annotations['worker-builds']:
if self.goarch.get(platform, platform) == 'amd64':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of goarch? A way to control arch to platform mapping?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better examples of goarch added to the tests.

if self.goarch.get(platform, platform) == 'amd64':
grouped_manifests = self.get_worker_manifest(all_annotations['worker-builds'][platform]) # noqa
break

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably handle the case where the platform mapping to 'amd64' was not found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It currently returns an empty list in that case. What else should it do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this plugin ran with group=False, and the amd64 goarch was not found, it indicates an error IMO. @twaugh, thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agree. Raise an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

@owtaylor owtaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments - I'm paying attention to this since I'll need to modify this to support OCI images and image indexes (equivalent to manifest lists)

response = requests.get(url, verify=not insecure, auth=auth)

image_manifest = json.loads(response.text)
image_manifest['tag'] = image_tag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only part of the v1 schema, not part of the v2 schema (or manifest list schema), and in any case, the tag that is applied in the registry is the reference from the URL. If all tags can share the exact same manifest content/digest, that seems preferable.

This software may be modified and distributed under the terms
of the BSD license. See the LICENSE file for details.

create a group of the image manifests from the worker builds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't very clear to me at all. "a group of the image manifest" or "group manifest" aren't - as far as I know - standard things. Is the final intent, something like:

Creates a manifest list that includes the manifests from the worker builds and uploads that to the configured registries

? (I think the group_manifests name for the plugin is also perhaps not the best one!) If the final intent is that if there is only one architecture, it creates a single manifest (essentially retagging the worker build), that should also be indicated here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind naming it "group_manifests" is that in the future it may need to group them together in an OCI Image Index, as well as (or instead of) a Docker Manifest List.

It's unfortunate that "group" is a noun as well as a verb. It's intended as a verb in this plugin name: bring together the image manifests in some way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I think of it as a verb, the plugin name seems fine. But this comment needs to be improved to be in line with that and to reference back to concepts (a manifest list) that people reading the code can be expected to know or able to look up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. possibly.

@owtaylor
Copy link
Contributor

I suspect that this code:

        digest = worker_digests[0]['digest']
        tag = worker_digests[0]['tag']
        repo = tag.split(':')[0]
        for image in self.workflow.tag_conf.images:
            image_tag = image.to_str(registry=False)

            url = '{}/v2/{}/manifests/{}'.format(registry, repo, digest)

            response = requests.head(url, verify=not insecure, auth=auth)
            response = requests.get(url, verify=not insecure, auth=auth)

should use the image name in place of the digest, but I'm not 100% sure.

Why? I think the assumption can be made that the worker pushed the same image for all the tags, and the digest that the worker returned is the reliable way to get exactly that image for that architecture. As @lcarva said, I think the head can be removed and this can be pulled out of the loop.

@twaugh
Copy link
Member

twaugh commented Jul 28, 2017

What goarch is for: manifest lists describe platforms in terms of the strings used by the Go language to describe operating systems (GOOS) and architectures (GOARCH). We're using platform names such as 'x86_64' to say GOOS=linux, GOARCH=amd64. So far, the GOOS is assumed to be 'linux', but to work out the GOARCH we need a translation dictionary to change 'x86_64' into 'amd64'. Hopefully other platform names will be the same as the GOARCH names. FWIW I just submitted a docs update to describe how this translation dictionary will be configured: containerbuildsystem/osbs-docs#31

@mlangsdorf mlangsdorf force-pushed the group-manifests branch 2 times, most recently from c7c3cdd to 1322245 Compare July 28, 2017 15:24
if valid:
return grouped_manifests
else:
raise PluginFailedException('failed to find an x86_64 platform')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception raised here will be wrapped in a PluginFailedException. Let's use something else. Seems like a good case for RuntimeError or ValueError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

for platform in all_annotations['worker-builds']:
if self.goarch.get(platform, platform) == 'amd64':
valid = True
grouped_manifests = self.get_worker_manifest(all_annotations['worker-builds'][platform]) # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why # noqa ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line is too long and there's no good place to split it.

Copy link
Contributor

@vrutkovs vrutkovs Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worker_data = all_annotations['worker-builds'][platform]
grouped_manifests = self.get_worker_manifest(worker_data)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

registry = 'https://' + registry

registry_noschema = urlparse(registry).netloc
for push_conf_registry in self.workflow.push_conf.docker_registries:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.workflow.push_conf would always return an empty list on orchestrator, as it is being filled by tag_and_push plugin.

Probably workers should also store the registry they've pushed the image to during the sync

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know delete_from_registry runs on the orchestrator and worker in arrangement 3 and later, and I took the registry handling code from delete_from_registry. But I don't know that the push_conf is the same between worker and orchestrator. I'm going to leave this alone for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete_from_registry is actually a noop in orchestrator build because as @vrutkovs pointed out, push_conf is empty in this case. This is being address by #741.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But no -- group_manifests will create an object in the registry before pulp_sync is called. We need to clean that up too.

msg = 'invalid schema from {}'.format(url)
raise PluginFailedException(msg)

for image in self.workflow.tag_conf.images:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - this property is being set by tag_and_push. I think the plugin should iterate over worker_digest instead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's right as it is -- tag_from_config will set these tags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I've assumed this function will be called on orchestrator - it probably won't, as for manifest list we need just a digest and goarch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tag_from_config only runs on worker builds, so will the workflow.tag_conf on the orchestrator build inherit the tags?

Copy link
Member

@twaugh twaugh Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the next arrangement version we'll be running tag_from_config in both the worker build (for the platform-specific unique tag) and the orchestrator build (for all the other tags).

url = '{}/v2/{}/manifests/{}'.format(registry, repo, digest)
response = requests.get(url, verify=not insecure, auth=auth)

image_manifest = json.loads(response.text)
Copy link
Member

@twaugh twaugh Jul 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use response.json() here. Note that json.loads(response.text) is actually wrong for subtle reasons relating to encoding (the text property will assume the wrong encoding because an RFC tells it to).

@mlangsdorf mlangsdorf force-pushed the group-manifests branch 4 times, most recently from ae3123c to 159059d Compare August 9, 2017 19:06

from six.moves.urllib.parse import urlparse

from atomic_reactor.plugin import PostBuildPlugin, PluginFailedException

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F401 'atomic_reactor.plugin.PluginFailedException' imported but unused

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is now.

from six.moves.urllib.parse import urlparse

from atomic_reactor.plugin import PostBuildPlugin, PluginFailedException
from atomic_reactor.util import Dockercfg, get_manifest_digests, get_config_from_registry

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F401 'atomic_reactor.util.get_config_from_registry' imported but unused

self.log.debug("getting manifests from %s", registry_noschema)
digest = worker_digests[0]['digest']
repo = worker_digests[0]['repository']

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W293 blank line contains whitespace

manifest_digests = get_manifest_digests(registry_image, registry,
insecure, secret_path)
push_conf_registry.digests[image_tag] = manifest_digests

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W293 blank line contains whitespace

@mlangsdorf mlangsdorf force-pushed the group-manifests branch 3 times, most recently from 4fe93bc to 78a09ca Compare August 11, 2017 16:07

if registry_image:
push_conf_reg.config = get_config_from_registry(
registry_image, registry_noschema, manifest_digests, insecure, secret_path, 'v2')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E501 line too long (105 > 100 characters)

@mlangsdorf mlangsdorf force-pushed the group-manifests branch 4 times, most recently from 57b2e21 to 9aad1f9 Compare August 15, 2017 10:43
image_tag = image.to_str(registry=False)
url = '{0}/v2/{1}/manifests/{2}'.format(registry, repo, image_tag)
self.log.debug("for image_tag %s, putting at %s", image_tag, url)
response = requests.put(url, image_manifest, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should call response.raise_for_status() here I think.


for image in self.workflow.tag_conf.images:
image_tag = image.to_str(registry=False)
url = '{0}/v2/{1}/manifests/{2}'.format(registry, repo, image_tag)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{2} must be only the suffix part, so image.tag I think.

@mlangsdorf mlangsdorf force-pushed the group-manifests branch 2 times, most recently from 0e00cde to 2ae1894 Compare August 16, 2017 17:29
@mlangsdorf
Copy link
Contributor Author

tested with brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/mlangsdo/buildroot:eng-rhel-7-docker-candidate-94693-20170816174925, which is a build of pkgs.devel.redhat.com/rpms/osbs-buildroot-docker@private-mlangsdorf-incremental

that uses:
https://github.com/mlangsdorf/atomic-reactor.git@arrangement4
https://github.com/mlangsdorf/osbs-client.git@arrangement4

build logs are here:
https://osbs.qa.engineering.redhat.com:8443/console/project/osbs-stage/browse/builds/helloworld-container-master-9bc2a/helloworld-container-master-9bc2a-77?tab=logs

please merge thanks.

Copy link
Member

@twaugh twaugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Added some comments about potential improvements.


from six.moves.urllib.parse import urlparse

from atomic_reactor.plugin import PostBuildPlugin, PluginFailedException
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is now.

Params:
* "secret" optional string - path to the secret, which stores
login and password for remote registry
:param group: bool, if true, return the grouped image manifests from the workers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: better description might be "if true, create a manifest list; otherwise only add tags to amd64 image manifest"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

super(GroupManifestsPlugin, self).__init__(tasker, workflow)
self.group = group
self.goarch = goarch or {}
self.registries = deepcopy(registries)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the copy is for. We don't modify it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.


def run(self):
if self.group:
raise NotImplementedError('group=True is not supported in group_manifest')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: should be "group_manifests"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

if registry in self.worker_registries:
self.worker_registries[registry].append(registry)
else:
self.worker_registries[registry] = [registry]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary if.

registry = digest['registry']
self.worker_registries.setdefault(registry, [])
self.worker_registries[registry].append(registry)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

if not response.ok:
msg = "PUT failed: {0},\n manifest was: {1}".format(response.json(),
image_manifest)
self.log.error(msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines don't get covered in unit testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

of the BSD license. See the LICENSE file for details.

get the image manifest lists from the worker builders. If possible, group them together
and return them. if not, return the x86_64/amd64 one after tagging it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use clarification to avoid putting 'manifest' and 'lists' together when not talking about 'manifest list' objects.

Get the image manifest digests from the worker builds and group them together into a manifest list.
When told not to group them (group=False), just tag the amd64 image manifest instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Also slightly clarified the commit message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still says "image manifest lists" which could be ambiguous.

@mlangsdorf mlangsdorf force-pushed the group-manifests branch 2 times, most recently from 9ca020a to 3f9a8eb Compare August 18, 2017 14:07
@twaugh
Copy link
Member

twaugh commented Aug 21, 2017

@mlangsdorf, is this OK to merge? I think it is, just checking.

@vrutkovs
Copy link
Contributor

IIUC #778 supersedes this PR

@twaugh
Copy link
Member

twaugh commented Aug 21, 2017

I'd like to be able to perform testing for arrangement version 4 which requires group_manifests, but does not require group=True to work.

Add a group_manifests plugin. For now, it fails if asked to create
a group of manifests.

If asked to return a single manifest, it pulls the x86_64 image manifest
from a repository passed as an argument. It then re-uploads the image
manifest to a new location for each tag in the tag_conf. It returns the
image manifest.

Signed-off-by: Mark Langsdorf <mlangsdo@redhat.com>
@twaugh twaugh changed the title group_manifests: create plugin for group manifests group_manifests: create minimal plugin for group manifests Aug 22, 2017
@twaugh twaugh merged commit 2bdebf8 into containerbuildsystem:master Aug 22, 2017
@twaugh
Copy link
Member

twaugh commented Aug 22, 2017

Draft release notes updated.

@mlangsdorf mlangsdorf deleted the group-manifests branch August 31, 2017 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants