• 1 Post
  • 46 Comments
Joined 2 years ago
Cake day: February 16th, 2024


  • Traditional way is to just use a WordPress account, and then move onto a paid hosting service if you decide you like keeping up with your blog. No point in paying for something you don’t use. Their CEO was a dick with open source stuff, but the website itself is still solid enough to be used to check if it’s a hobby you want to actually keep up with.

    If you want to spend just as much time managing the blog as you do actually sharing things, a raspberry pi, Hugo, nginx, and a lot of time are also an option.

    I personally use Porkbun for the .com and Hostinger for the backend, and it’s been great for the past couple years to host my own WordPress setup.

    But actually, I think that makes me oldschool. The new kids are using neocities.



  • Yeah, and on the smaller / earlier side of a theoretical search engine company, Google offers their API for free. I think this is actually another one of the biggest contributors to why nobody has tried to make a new search engine with their own index. Why waste hundreds of thousands of dollars on hardware, and even more on personnel costs, when you can just have Google do it for you instead?


  • When I said ‘direct expenses’ I mostly meant the cost of owning / running a database of internet pages and metadata comprehensive enough to be considered part of a ‘fully featured search engine’. There’s also the other half: the compute required to obtain those pages and create that metadata. At most, I would guess that compute would be equal in cost to just having the space for a database of all the internet pages (scaling up after that based on how many users you need to support). In short, a scaled-down search engine that had access to every page on the internet that people would want to find could cost as low as $100,000 for a first-time hardware purchase.

    The Internet Archive does in fact have their own web crawler. They also archive sites upon request; I’ve had my personal website on there for almost two decades now, specifically at my request.

    They also have a full-featured search function available for anyone on their website at archive.org. This is why I say they’re a reasonable price comparison for a full-featured search engine. They may spend more on storage and less on metadata compute than a theoretical smaller search engine, but at the end of the day, that’s just a re-balancing of the cost, not a completely new and more excessive cost.

    I think direct expenses (the cost of owning and maintaining an internet index database) are definitely significant. The completely free access that Google gives to anyone who wants it is way more than any single private entity or company could support just because they want to have it. I don’t think it would be anywhere even close to a billion dollars though.

    I think the hardest part of having an internet index database would be the knowledge required to create and maintain it, especially under the hostile forces that are the 75 billion dollar SEO industry. If a selfhosted search engine became big enough that the SEO industry started trying to break it, I don’t think that company would survive for very long at all.

    Google is losing that battle, like, almost completely. What hope would a small startup style company have of battling it and staying financially solvent, especially if they’re trying to be different from google and bing and actually showing results without the pressure of advertisers breathing down their necks?

    I think the hardware side of a search engine is solvable with silicon valley startup level of funding. I think it’s impossible for anyone in the current day and age to make that sort of project solvent while keeping the user (instead of the advertiser) as the main customer. Anyone else who can’t get those funds, or doesn’t actually want to do a results-oriented search engine, can just mooch off of Google and Bing for free.


  • Size isn’t everything, so the real question is: what search site uses only the common crawl index and has results on par with bing or google?

    None of them. At least, none that I’m aware of. I just don’t think that direct expenses are the reason that there are only two major web search tools. I also don’t think Google and Bing are good examples to point at when estimating the cost of running a complete search engine.

    If you read all of your article, the author notes that while Google has an index of about 400 billion, the Internet Archive’s index is actually bigger at around 865 billion.

    The Internet Archive has an operating cost of about $33 million a year. I think that is a much more reasonable example to point to and say “running a complete search engine would have a similar price as that”.

    Also, very neat article btw. I would have never guessed that Google’s search index count has been shrinking for the past little bit. Or that Google actively culls results from their database that it thinks people won’t ever want to see.


  • I think most startup search engines use Google/Bing because it’s free/way cheaper than running their own database, not because it’s impossible. It also likely sidesteps a lot of the SEO bullshit simply because Google/Bing have more experience working around it.

    So like, short term/small size it’s cheaper and straight up easier to piggyback off of the big two companies, rather than manage your own data set. Long term, if you get popular enough to be noticed, I expect that the SEO business would wreck any selfhosting search engine startup company’s results pretty regularly.


  • That’s like saying that it’s impossible to run a car manufacturing company without 100 billion because that’s how much Ford spends on their car manufacturing processes. It makes no sense.

    Yes, making an original search engine is hard, just like making trucks is. But that doesn’t mean that running either one requires billions of dollars to do.

    Common crawl is a nonprofit that regularly shares free copies of every internet page with metadata, and it damn well doesn’t take billions to do it either. https://commoncrawl.org/


  • The issue is that the internet is too large to index.

    It’s really not. At least, not yet. It’s a large part of why it isn’t done, but it’s not the only one, and I’d argue, not even the main reason it isn’t really done.

    A complete crawl of the internet in 2025, with metadata, is only 424TiB. For comparison, my $1,000 home setup can handle about a tenth of that (in storage at least). The hardware to maintain a single database of the internet with metadata could cost under $100,000, easily.

    Dave, your comment about it costing a billion to run Bing or Google might be true, but it is completely unrelated to the realities of running a small search engine and has everything to do with the fact that they are Google and Microsoft products respectively.

    The real issue isn’t the physical size of the internet, it’s much more likely to be the complexity of making a search algorithm that can compete with the $75 billion SEO market that exists to break search engines.
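
To put rough numbers on the storage claim above, here is a back-of-envelope sketch in Python. It only rescales the figures already given (424TiB, a $1,000 setup holding about a tenth of it). The `REPLICAS` value is purely an assumption for illustration, and none of this covers compute, bandwidth, power, or staff.

```python
# Back-of-envelope check of the storage numbers in the comment above.
CRAWL_TIB = 424            # full 2025 crawl with metadata (figure from the comment)
HOME_SETUP_COST = 1_000    # dollars (figure from the comment)
HOME_SETUP_SHARE = 0.10    # "about a tenth" of the crawl (figure from the comment)
REPLICAS = 3               # assumption: keep three copies for redundancy


def storage_cost(crawl_tib: float, replicas: int = 1) -> float:
    """Scale the home setup's cost per TiB up to the full crawl."""
    home_capacity_tib = crawl_tib * HOME_SETUP_SHARE
    cost_per_tib = HOME_SETUP_COST / home_capacity_tib
    return cost_per_tib * crawl_tib * replicas


single = storage_cost(CRAWL_TIB)
replicated = storage_cost(CRAWL_TIB, REPLICAS)
print(f"one copy: ${single:,.0f}")
print(f"{REPLICAS} copies: ${replicated:,.0f}")
```

At hobbyist prices that comes out to roughly $10,000 for one copy of the raw storage, so even with several copies there is a lot of headroom under the $100,000 guess.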


  • Not OP, but I left for similar reasons. The CEO publicly supported the Republican admin (mildly, but even at the time, stupidly). The statement sent out about it after the fact was also sus, but not really super bad.

    I left anyway. I’d rather not pay a CEO to publicly support the administration that is specifically targeting my family for political points.

    I also heard a lot of fear mongering on the fediverse about how their new AI conversations can’t be private because it gets to their servers directly, but I couldn’t find anyone reasonable online who actually looked into it and confirmed that.

    So like, they’ve got all the ingredients for more stupidity, and as we’ve seen time and again, everything pressuring them to fuck up/enshittify is also there in the background too.



  • For what it’s worth, your Razer issues were likely not a random thing. Razer has a rough history of bad quality control and even worse customer support. The one thing they’re still good at is marketing. https://m.youtube.com/watch?v=KhfqhCxqpQ8

    I’ve also heard good things about the Framework laptops, but I’ve not personally used any of them. At least with a Framework laptop you can do upgrades later, rather than having to buy a completely new device whenever a part breaks or you have it for longer than three years. Easily available battery replacements alone make that a good deal.







  • DaGeek247@fedia.io to Ask Lemmy@lemmy.world: *Permanently Deleted* · 3 months ago

    About six weeks. I was attached to someone else’s unit at NTC in California for a training exercise with them. There were no showers in the field, and the showers pre and post exercise were colder than a witch’s tit, and open as a gay man’s asshole after an all-night orgy.

    And that wasn’t the worst part of the whole experience either.