[Gluster-devel] Reducing the size of the glusterdocs git repository

Nigel Babu nigelb at redhat.com
Tue May 17 12:32:00 UTC 2016


We could potentially set up travis-ci to run builds that fail loudly if we
commit something that throws a warning. I've tried out the possibility here:

https://travis-ci.org/nigelbabu/glusterdocs/jobs/130816121

I've purposefully made it fail. Success looks like this:

https://travis-ci.org/nigelbabu/glusterdocs/jobs/130815368

We can, in the future, add checks so that the documentation has working links
and no large files are checked in. If there's interest, I'm happy to send
a pull request for this.
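
To make the "no large files" idea concrete, here is a minimal sketch of a
check that a CI job could run. It is not what the Travis builds linked above
do. The 1 MiB threshold and the function names are assumptions chosen for
illustration, since nobody in this thread has proposed a limit.

```python
import os

# Hypothetical threshold; the thread does not specify a size limit.
MAX_BYTES = 1024 * 1024


def find_large_files(root, max_bytes=MAX_BYTES):
    """Return sorted (path, size) pairs for files under root above max_bytes."""
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own object store; only the working tree is checked.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            size = os.path.getsize(path)
            if size > max_bytes:
                oversized.append((path, size))
    return sorted(oversized)


def check(root=".", max_bytes=MAX_BYTES):
    """Print offenders and return an exit status (1 if any file is too large)."""
    oversized = find_large_files(root, max_bytes)
    for path, size in oversized:
        print(f"{path}: {size} bytes exceeds limit of {max_bytes}")
    return 1 if oversized else 0
```

A CI step could end with `sys.exit(check("."))` so that the build fails when
an oversized file is present. Note that this only inspects the working tree,
so it stops new large files from landing but does nothing about files that
are already in the history.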

On Tue, May 17, 2016 at 4:55 PM, Amye Scavarda <amye at redhat.com> wrote:

>
> On Tue, May 17, 2016 at 3:59 PM, Amye Scavarda <amye at redhat.com> wrote:
>
>>
>>
>> On Tue, May 17, 2016 at 3:56 PM, Niels de Vos <ndevos at redhat.com> wrote:
>>
>>> On Tue, May 17, 2016 at 02:42:27PM +0530, Amye Scavarda wrote:
>>> > Hi all,
>>> >
>>> > So we have a new slideshare.net account, GlusterCommunity (
>>> > http://www.slideshare.net/GlusterCommunity/) that connects with the
>>> > Gluster.org G+ community - and it'll even connect with the YouTube
>>> > channel!
>>> >
>>> > I've submitted a PR to the glusterdocs repo that will need some
>>> > review: it removes all of the presentations from the repo and links
>>> > to slideshare. (https://github.com/gluster/glusterdocs/pull/109)
>>>
>>> Cool, but note that the size of the repository does not decrease with
>>> that commit. The git repository will still contain all the presentations
>>> in the history/log. But not adding any more presentations is a good step
>>> already :-)
>>>
>> You are correct, but it will not make the current issue worse. It would
>> help if I actually hit 'reply all'.
>>
>>
>>> > In no way does this mean that anyone needs to use Slideshare to host
>>> > PDFs of slides, you can use whatever you want. I chose slideshare
>>> > because there was an older Gluster account that had some Gluster.com
>>> > presentations and it links with YouTube.
>>> >
>>> > Thoughts?
>>>
>>> Looks good to me, but maybe you can address this comment in the GitHub
>>> pull request:
>>>   https://github.com/gluster/glusterdocs/pull/109/files#r63498585
>>>
>> That's why I have you all to proofread.
>>
>>
> One thing I'm noticing, we don't have any sort of CI on Read The Docs. Let
> me see if there's not an easy way to fix that and have TravisCI tell us if
> we're about to merge something with a bunch of borked links.
> -- a
>
>
>>  - amye
>>
>>
>>> Thanks,
>>> Niels
>>>
>>> > - amye
>>> >
>>> >
>>> >
>>> > On Thu, May 12, 2016 at 7:49 PM, Niels de Vos <ndevos at redhat.com> wrote:
>>> >
>>> > > On Thu, May 12, 2016 at 03:55:23PM +0530, Kaushal M wrote:
>>> > > > On Thu, May 12, 2016 at 1:25 PM, Niels de Vos <ndevos at redhat.com> wrote:
>>> > > > > On Thu, May 12, 2016 at 02:56:52AM -0400, Prashanth Pai wrote:
>>> > > > >>
>>> > > > >>
>>> > > > >> > > Right now, even cloning the main docs branch is a huge pain
>>> > > > >> > > due to the size of the repo.
>>> > > > >> > > I think that branching will not solve this problem, and
>>> > > > >> > > might make the problem worse.
>>> > > > >> >
>>> > > > >> > Branching would not increase the size of the repository itself.
>>> > > > >> > Only the size used on RTD will be bigger as the HTML for
>>> > > > >> > different branches will be generated (so the content is there
>>> > > > >> > 2x). Cloning the repository is not affected.
>>> > > > >> >
>>> > > > >> > Deleting files (like the presentations) will also not remove
>>> > > > >> > them from the git repository. It will stay possible to check out
>>> > > > >> > an older version of the docs from the same repository; all of
>>> > > > >> > the history is downloaded once the repository is cloned.
>>> > > > >> >
>>> > > > >> > In order to reduce the size of the repository, you need to
>>> > > > >> > create a new one, and import the changes without the big files.
>>> > > > >> > While importing changes from another (the current) repository,
>>> > > > >> > it is possible to modify the changes on the fly and prevent
>>> > > > >> > importing the big files. This keeps the history and the credits
>>> > > > >> > for the contributors.
>>> > > > >>
>>> > > > >> This is an alternative solution:
>>> > > > >> https://rtyley.github.io/bfg-repo-cleaner/
>>> > > > >
>>> > > > > Right, I was thinking about git-filter-branch. In the end, I am
>>> > > > > pretty sure that the old/original repository is not valid anymore.
>>> > > > > I expect that 'git rebase' is used for the cleaning, and that will
>>> > > > > change the commit-ids of patches that follow after a 'cleaned'
>>> > > > > patch.
>>> > > > >
>>> > > > > My recommendation for a separate repository is only for preventing
>>> > > > > inconsistencies between the upstream repository (after cleaning)
>>> > > > > and the previously cloned/forked repositories that contributors
>>> > > > > have.
>>> > > > >
>>> > > > >> > Where would you suggest the presentations (and other files?)
>>> > > > >> > should get located?
>>> > > > >>
>>> > > > >> Maybe an official Gluster community slideshare or speakerdeck
>>> > > > >> account?
>>> > > > >
>>> > > > > Possibly something like this. But we should have a plan for the
>>> > > > > existing presentations too. And we have to accept that not everyone
>>> > > > > presenting about a Gluster (related) topic will use 'our' SaaS
>>> > > > > instance.
>>> > > > >
>>> > > > >> Git LFS is also an option but we don't really need versioning
>>> > > > >> for presentation files. Git LFS will keep large files in a
>>> > > > >> separate location and keep a "pointer" to those in the repo.
>>> > > > >
>>> > > > > I'd prefer something like this. Most of my presentations are
>>> > > > > written while I'm travelling, so a connected service is not really
>>> > > > > an option for me in any case.
>>> > > >
>>> > > > The docs repo should just have links to the presentations.
>>> > > > They could be hosted on slideshare/speakerdeck, google drive or they
>>> > > > could be hosted html5 presentations.
>>> > > > If required we could just host the presentations on
>>> > > > download.gluster.org.
>>> > > > I've seen it being used to host resources for tutorials previously
>>> > > > (like disk images),
>>> > > > so hosting the actual presentations shouldn't be too hard.
>>> > >
>>> > > I really do not care where they are hosted. We just cannot demand the
>>> > > use of a SaaS for them. We can offer the option of course, but still
>>> > > allow presenters to use the tool of their preference.
>>> > >
>>> > > Niels
>>> > >
>>> > > _______________________________________________
>>> > > Gluster-devel mailing list
>>> > > Gluster-devel at gluster.org
>>> > > http://www.gluster.org/mailman/listinfo/gluster-devel
>>> > >
>>> >
>>> >
>>> >
>>> > --
>>> > Amye Scavarda | amye at redhat.com | Gluster Community Lead
>>>