Closed Bug 986112 Opened 10 years ago Closed 10 years ago

Make blobber uploads discoverable

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: armenzg)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 8 obsolete files)

7.66 KB, patch
catlee
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
488 bytes, patch
catlee
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
2.79 KB, text/plain
Details
3.67 KB, patch
mozilla
: review+
armenzg
: checked-in+
Details | Diff | Splinter Review
      No description provided.
Summary: Make blobber uploads discoverables → Make blobber uploads discoverable
Assignee: nobody → armenzg
what does this mean?
Current context is this:
* filename and URL are tinderboxprinted and that can only be determine post-log analysis

e.g.
https://tbpl.mozilla.org/php/getParsedLog.php?id=36463088&tree=Cedar&full=1
13:38:14     INFO -  (blobuploader) - INFO - TinderboxPrint: Uploaded 6165981a-b86f-40ad-990c-c8c05006c547.dmp to http://mozilla-releng-blobs.s3.amazonaws.com/blobs/cedar/sha512/cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e
13:38:19     INFO -  (blobuploader) - INFO - TinderboxPrint: Uploaded mozilla-test-fail-screenshot__crs3c.png to http://mozilla-releng-blobs.s3.amazonaws.com/blobs/cedar/sha512/7a90fda7a81e88eff74dcfbde10c68cb6965b38e3f4124c5c06b09e9deb6407227b13f923544e851ab9dea7a1eb996bf2f2163f26e4f6f9c2b554ec404b7c921

IIUC the intent is to upload structured logs (upcoming) to blobber and be able to read that instead of the logs.
I'm thinking of adding the info to the associated json file for each build; what do you think?
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32/1395309723/firefox-31.0a1.en-US.win32.json

jgriffin, I hope I understood this well.
Yes, we'd like to have a method of accessing files uploaded by test runs without having to parse the entire raw log to find the url's.  The primary immediate use case for this is structured logs that we're starting to produce in automation; we'd like Treeherder to be able to consume these without the overhead of parsing the entire raw log first.

Armen, I think your solution of adding a json file for each build which points to the blobber uploads for that build is perfect!
My use case for this as follows:

I'm trying to use try to collect code-coverage results, which means I need to upload packs of files containing the coverage results from try builds. I'm using blobber for this now, but this requires unpacking the logs: <https://github.com/jcranmer/m-c-tools-code-coverage/blob/master/collect-try-results.py#L203>. I've been discovering the interesting log files for now by listing FTP directories recursively.
We have no easy fix in here.

Uploading to ftp from a test machine (no keys on-disk to upload to ftp) is not an option (this is why blobber exists to begin with).
With that being said, what I say in comment 2 about the current .json makes no sense. The file is uploaded by the build machines.

I can think of these options.

Option A:
* the test job uploads a file to blobber which has the information about any other files that have been uploaded with blobber (this file must be the last one)
* a property is set called "uploaded_artifacts_metadata_url" (or similar) which is the special file
* we read from builds 'builds-YYYY-MM-DD.js.gz' [1] the property "uploaded_artifacts_metadata_url"

Option B:
Another approach is to make blobber predictable instead of just using "$branch" + sha512. I can't think of a clean approach in here.

Option C:
We have an API to answer all these questions. This nevertheless would be based on option A.

I think option A is easily doable.

[1] https://secure.pub.build.mozilla.org/builddata/buildjson/builds-2014-04-23.js.gz
Option A sounds pretty sane. The trick is just to bootstrap something that gives you a way to get the blobber URLs for files.
Do we have a testing environment that I can use?
Or should I set the blobber server locally?

I have not been able to find documentation besides the one on the github repo.
The documentation for blobber should be on mana.
@Armen: if you need a hand on fixing this bug, let me know, I wanna help. Also, should you need more info on how to run blobber env locally, I just wrote a blog-post: http://mtabara.net/2014/05/17/blobber-local-environment-setup/ . As to mana docs, I no longer have LDAP credentials so I can't get in. But you can ask Rail, he's in charge there :-)
Flags: needinfo?(armenzg)
It will take a bit to get back to this, however, I'm OK for now. Thanks!
Flags: needinfo?(armenzg)
I will be looking into this again this week.
Attached patch [wip] blobuploader.diff (obsolete) — Splinter Review
This is a wip that can generate this:
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/ash/sha512/73bf56a7c5ebef280e5d4d2119acc92d52305918733dac50a72019e1534b255baf44f7a5b82a11f12c6eb0f56d2c0c3c8311cc3ad85c4ce8102644d7c9490c9d
{"file.txt": "http://mozilla-releng-blobs.s3.amazonaws.com/blobs/ash/sha512/eb6bfd31d90493d6a60f9eb7f576e9913a9607c5b8387ecba8935ffa1560ea7913b4a228534f3c25fd3ba955fe4f80459590adb529db95daca2c4bfc429bcdf7"}

This is how it works:
1) blobuploader iterates through the files that need to be uploaded
2) we build a dictionary with the keys being the filenames and the value being the blobber url
3) once we're done going through all the files we generate a json file which we upload to blobber
4) we create a file in the "properties" dir that points to the file uploaded with info about all other files

I still need to work on accomplishing #4. I'm thinking of passing this info with another parameter to blobberc.py
Attachment #8441428 - Flags: feedback?(catlee)
Attachment #8441428 - Attachment is obsolete: true
Attachment #8441428 - Flags: feedback?(catlee)
Attachment #8441549 - Flags: review?(catlee)
Assuming the previous patch is approved.
Attachment #8441552 - Flags: review?(aki)
Comment on attachment 8441552 [details] [diff] [review]
create properties dir if not existent, request blobberrc.py to write a properties-summary-file

Hm.  We already have a properties dir for buildbot properties.
Did you intend to have these become buildbot properties, or keep them separate?

If the latter, rename to blobber-properties.
If the former, I *think* the properties dir is one level up (in base_work_dir, not in abs_work_dir) and we should go through set_buildbot_property(). http://hg.mozilla.org/build/mozharness/file/316033a17e14/mozharness/mozilla/buildbot.py#l96
Comment on attachment 8441552 [details] [diff] [review]
create properties dir if not existent, request blobberrc.py to write a properties-summary-file

Clearing review until response to previous comment.
Attachment #8441552 - Flags: review?(aki)
I want them to become buildbot properties.
I will use set_buildbot_property().
I got the path to properties wrong as you pointed out.

Here's the job working in Ash (I will have to re-test after I attend to aki's comment):
https://tbpl.mozilla.org/?tree=Ash&jobname=Rev4 MacOSX Snow Leopard 10.6 ash debug test mochitest-browser-chrome-1&rev=dc74007d4a19
> (blobuploader) - INFO - TinderboxPrint: uploaded_files.txt: uploaded
which has the information of the other two files uploaded:
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/ash/sha512/a1b8b2d758c33a4480eee90934a37d09dae9d9a596c7486694835862ba73e262b3fee85ab7d7ccf08592f83bd09d3ae292541c7f667ee904f08c0530aec24ad7
Attachment #8441552 - Attachment is obsolete: true
Attachment #8442252 - Flags: review?(aki)
Adjusting some comments and slight change.
Attachment #8441549 - Attachment is obsolete: true
Attachment #8441549 - Flags: review?(catlee)
Attachment #8442255 - Flags: review?(catlee)
Comment on attachment 8442252 [details] [diff] [review]
request blobberrc.py to write a properties-summary-file

Can you update the BlobUploadMixin docstring to mention that there is a dependency on BuildbotMixin ?
Attachment #8442252 - Flags: review?(aki) → review+
Comment on attachment 8442255 [details] [diff] [review]
generate a summary file and create a property file pointing to it

Review of attachment 8442255 [details] [diff] [review]:
-----------------------------------------------------------------

As discussed on IRC, let's change the mozharness / blobber integration slightly.

mozharness chooses a filename where it wants blobber to write the url to the upload manifest. This shouldn't be inside the properties directory. It passes that filename to blobberc.py via --output-manifest-url (or similar).

blobberc.py creates a manifest of all the files its uploaded, and uploads the manifest like you're doing. It then writes the URL to the manifest into the file specified by --output-manifest-url.

mozharness then reads that file after blobberc.py is done, and sets the appropriate buildbot property.

I think this is cleaner over all, but will require some more changes here, and to your mozharness work. Sorry!

::: blobberc.py
@@ +4,5 @@
> +-u, --url URL                     URL to blobber server to upload to.
> +-a, --auth AUTH_FILE              user/pass AUTH_FILE for signing the calls
> +-b, --branch BRANCH               Specify branch for the file (e.g. try, mozilla-central)
> +-p, --property-summary-file FILE  Track all uploaded files and create a file with a URL
> +                                  to such uploaded file.

let's call this --output-manifest-url URL_FILE

@@ +85,4 @@
>               os.path.isfile(os.path.join(dirname, f)) and
>               not os.path.islink(os.path.join(dirname, f))]
>  
> +    upload_meta_data = {}

let's call this upload_manifest

@@ +93,5 @@
> +        blob_url = upload_file(hosts, filename, branch, auth, compress=compress_file,
> +                               allowed=allowed_to_send(filename, filetype_whitelist))
> +        upload_meta_data[f] = blob_url
> +
> +    if upload_meta_data != {} and property_summary_file != None:

This could be written as

    if upload_meta_data and property_summary_file:

non-empty dicts evaluate as True, as do non-empty strings

@@ +100,5 @@
> +        log.info("Here are the contents: %s" % json.dumps(upload_meta_data))
> +        # 1) Write to disk (side-by-side with all other files)
> +        uploaded_files_filepath = os.path.join(dirname, 'uploaded_files.txt')
> +        with open(uploaded_files_filepath, 'w') as outfile:
> +            json.dump(upload_meta_data, outfile)

This should be called uploaded_files.json. It could be in a temporary file too...I'm not sure if it's good or bad that this lives in the upload dir...It would mean that if you re-ran blobberc.py against the same directory, you'd upload the _old_ uploaded_files.json, and then create and upload a new one.

@@ +107,5 @@
> +                compress=compress_file, allowed=allowed_to_send(filename, filetype_whitelist))
> +        # 3) Record summary of all uploaded files
> +        #    Enforce the filename and the property name to match
> +        #    Write the information in buildbot properties style
> +        f = open(property_summary_file, 'w')

use a with statement here.

with open(manifest_url_file, 'w') as f:
   f.write(blob_url)

you may or may not need to add a newline...depends how mozharness handles writing these properties to disk after.
Attachment #8442255 - Flags: review?(catlee) → review-
Did I get it right?
Attachment #8442255 - Attachment is obsolete: true
Attachment #8442904 - Flags: review?(catlee)
I will ask for review once the blobuploader patch gets reviewed first.
Attachment #8442252 - Attachment is obsolete: true
Attachment #8442909 - Flags: review?(catlee)
w/o typo.
Attachment #8442907 - Attachment is obsolete: true
I'm calling uploaded_files.txt instead of uploaded_files.json until the blobber patch is reviewed and deployed.
Comment on attachment 8442904 [details] [diff] [review]
[blobuploader] generate and upload a manifest file, write the URL to a temp file

Review of attachment 8442904 [details] [diff] [review]:
-----------------------------------------------------------------

::: blobberc.py
@@ +85,4 @@
>               os.path.isfile(os.path.join(dirname, f)) and
>               not os.path.islink(os.path.join(dirname, f))]
>  
> +    upload_manifests = {}

nit: upload_manifest instead of upload_manifests
Attachment #8442904 - Flags: review?(catlee) → review+
Comment on attachment 8442909 [details] [diff] [review]
[blobber] allow json files

Review of attachment 8442909 [details] [diff] [review]:
-----------------------------------------------------------------

::: blobber/config.py
@@ +32,4 @@
>      'png': 'image/png',
>      'jpeg': 'image/jpg',
>      'jpg': 'image/jpg',
> +    'json': 'text/plain',

should this be application/json instead? either way is fine I think.
Attachment #8442909 - Flags: review?(catlee) → review+
Attachment #8442911 - Flags: review?(aki)
Attachment #8442911 - Flags: review?(aki) → review+
ping wrt to pull requests.
Flags: needinfo?(tabara.mihai)
Flags: needinfo?(catlee)
@Armen: Will look into this & merge - tomorrow morning first thing.
Been out of town, didn't have laptop, sorry for delay.
Flags: needinfo?(tabara.mihai)
Merged both.
Blobuploader 1.2.0 version lies https://pypi.python.org/pypi/blobuploader
Blobber (server) is up-to-date on github. Rail has a nice quick script to re-deploy this on Amazon.
Thank you Mihai!
Flags: needinfo?(catlee)
Depends on: 1029598
Depends on: 1029603
Now that blobber has been deployed.
I will have to retest and use a .json file instead of a .txt file.
Attachment #8442904 - Flags: checked-in+
Attachment #8442909 - Flags: checked-in+
Comment on attachment 8442911 [details] [diff] [review]
[mozharness] request blobberrc.py to write a URL to the manifest in a temp file

https://hg.mozilla.org/build/mozharness/rev/e095aa215e3c
Attachment #8442911 - Flags: checked-in+
Comment on attachment 8442911 [details] [diff] [review]
[mozharness] request blobberrc.py to write a URL to the manifest in a temp file

Backing out: "test fail on test_mozilla_blob_upload.TestBlobUploadMechanism.test_blob_upload_mechanism"
https://hg.mozilla.org/build/mozharness/rev/c49fa160c924

I will look into it tomorrow.
Attachment #8442911 - Flags: checked-in+ → checked-in-
This is a slightly modified version that would allow us to pass the test_mozilla_blob_upload.py (error on attachment 8445493 [details]).

The way to fix it is to have a defined place for the manifest file rather than a temp file.

I will be testing this on Ash.
Attachment #8442911 - Attachment is obsolete: true
Merged your patch on github (released 1.2.1 ). See https://pypi.python.org/pypi/blobuploader

Bumped its version on mozharness.
Attachment #8445986 - Flags: review?(armenzg)
Comment on attachment 8445986 [details] [diff] [review]
Bump blobuploader version to 1.2.1.

Thanks for the review but I'm still testing the patch.
See attachment 8445967 [details] [diff] [review].
Attachment #8445986 - Flags: review?(armenzg)
FTR this is the patch that got reviewed and pulled:
https://github.com/MihaiTabara/blobuploader/pull/15

The change is simple:
uploaded_files.txt -> uploaded_files.json
Comment on attachment 8445967 [details] [diff] [review]
[mozharness] request blobberrc.py to write a URL to the manifest in a manifest file

It works on ash-mozharness and passes the test.
Attachment #8445967 - Flags: review?(aki)
Attachment #8445986 - Attachment is obsolete: true
Comment on attachment 8445967 [details] [diff] [review]
[mozharness] request blobberrc.py to write a URL to the manifest in a manifest file

>+            # if blobberc writes anything into the manifest file then it means that a manifest file has been uploaded
>+            if os.path.getsize(manifest_path) > 0:
>+                f = open(manifest_path, 'r')
>+                blobber_manifest_url = f.read()
>+                f.close()

self.read_from_file() ?

>+                self.set_buildbot_property(prop_name="blobber_manifest_url",
>+                        prop_value=blobber_manifest_url, write_to_file=True)
>+            os.remove(manifest_path)

self.rmtree()
Attachment #8445967 - Flags: review?(aki) → review+
Depends on: 1030737
Comment on attachment 8445967 [details] [diff] [review]
[mozharness] request blobberrc.py to write a URL to the manifest in a manifest file

Thanks for those useful functions!

I've pushed it to default with 1.2.0 and testing one last time on Cypress:
https://tbpl.mozilla.org/?tree=Cypress

Meanwhile, I've requested the newer blobuploader to be uploaded (.json file instead of .txt).
I will bump the version once uploaded.
Attachment #8445967 - Flags: checked-in+
Once bug 1030737 is deployed we can go ahead and bump the version.
Merged to production, and deployed.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Depends on: 1048270
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: