Data Software Engineering Daily

Synopsis

Databases and data engineering episodes of Software Engineering Daily

Episodes

  • RudderStack: Open Source Customer Data Infrastructure with Soumyadeb Mitra

    20/05/2020 Duration: 47min

    Customer data infrastructure is a type of tool for saving analytics and information about your customers. The company that is best known in this category is Segment, a very popular API company. This customer data is used for making all kinds of decisions around product roadmap, pricing, and design. RudderStack is a company built around …

  • Matterport 3-D Imaging with Japjit Tulsi

    19/05/2020 Duration: 49min

    Matterport is a company that builds 3-D imaging for the inside of buildings, construction sites, and other locations that require a “digital twin.” Generating digital images of the insides of buildings has a broad spectrum of applications, and there are considerable engineering challenges in building such a system. Matterport’s hardware stack involves a camera built …

  • Frontend Performance with Anycart’s Rafael Sanches

    18/05/2020 Duration: 51min

    There are many bad recipe websites. Every time I navigate to a recipe website, it feels like my browser is filling up with spyware. The page loads slowly, everything seems broken, and I can feel the 25 different JavaScript adtech tags interrupting each other. Whether I am searching for banana bread or a spaghetti sauce …

  • International Consumer Credit Infrastructure with Brian Regan and Misha Esipov

    14/05/2020 Duration: 45min

    A credit score is a rating that allows someone to qualify for a line of credit, which could be a loan such as a mortgage, or a credit card. We are assigned a credit score based on a credit history, which could be related to work history, rental payments, or loan repayments. One problem with …

  • Social Distancing Data with Ryan Fox Squire

    11/05/2020 Duration: 50min

    Social distancing has been imposed across the United States. We are running an experiment unlike anything before it in history, and it is likely to have a lasting impact on human behavior. By looking at location data of how people are moving around today, we can examine the real-world impacts of social distancing. SafeGraph is …

  • Dropbox Engineering with Andrew Fong

    08/05/2020 Duration: 54min

    Dropbox is a consumer storage product with petabytes of data. Dropbox was originally started on the cloud, backed by S3. Once there was a high enough volume of data, Dropbox created its own data centers, designing hardware for the express purpose of storing user files. Over the last 13 years, Dropbox’s infrastructure has developed hardware, …

  • Pravega: Storage for Streams with Flavio Junquiera

    07/05/2020 Duration: 53min

    “Data stream” is a term that can be used in multiple ways. A stream can refer to data in motion or data at rest. When a stream is data in motion, an endpoint is receiving new pieces of data on a continual basis. Each new data point is sent over the wire and captured by …

  • Advanced Redis with Alvin Richards

    06/05/2020 Duration: 53min

    Redis is an in-memory object storage system that is commonly used as a cache for web applications. This core primitive of in-memory object storage has created a larger ecosystem encompassing a broad set of tools. Redis is also used for creating objects such as queues, streams, and probabilistic data structures. Machine learning systems also need …

  • Multicloud MySQL with Jiten Vaidya and Anthony Yeh

    05/05/2020 Duration: 52min

    For many applications, a transactional MySQL database is the source of truth. To make a MySQL database scale, some developers deploy their database using Vitess, a sharding system built on top of Kubernetes. Jiten Vaidya and Anthony Yeh work at PlanetScale, a company that focuses on building and supporting MySQL databases sharded with Vitess. Their …

  • Data Lakehouse with Michael Armbrust

    01/05/2020 Duration: 59min

    A data warehouse is a system for performing fast queries on large amounts of data. A data lake is a system for storing high volumes of data in a format that is slow to access. A typical workflow for a data engineer is to pull data sets from this slow data lake storage into the …

  • JAMStack Content Management with Scott Gallant, Jordan Patterson, and Nolan Phillips

    30/04/2020 Duration: 55min

    A content management system (CMS) defines how the content on a website is arranged and presented. The most widely used CMS is WordPress, the open source tool that is written in PHP. A large percentage of the web consists of WordPress sites, and WordPress has a huge ecosystem of plugins and templates. Despite the success …

  • Prefect Dataflow Scheduler with Jeremiah Lowin

    29/04/2020 Duration: 01h04min

    A data workflow scheduler is a tool used for connecting multiple systems together in order to build pipelines for processing data. A data pipeline might include a Hadoop task for ETL, a Spark task for stream processing, and a TensorFlow task to train a machine learning model. The workflow scheduler manages the tasks in that …

  • CockroachDB with Peter Mattis

    28/04/2020 Duration: 56min

    A relational database often holds critical operational data for a company, including user names and financial information. Since this data is so important, a relational database must be architected to avoid data loss. Relational databases need to be distributed systems in order to provide the fault tolerance necessary for production use cases. If a …

  • Dask: Scalable Python with Matthew Rocklin

    27/04/2020 Duration: 01h01min

    Python is the most widely used language for data science, and there are several libraries that are commonly used by Python data scientists, including NumPy, Pandas, and scikit-learn. These libraries improve the user experience of a Python data scientist by giving them access to high level APIs. Data science is often performed over huge datasets, …

  • NGINX API Management with Kevin Jones

    22/04/2020 Duration: 53min

    NGINX is a web server that can be used to manage the APIs across an organization. Managing these APIs involves deciding on the routing and load balancing across the servers which host them. If the traffic of a website suddenly spikes, the website needs to spin up new replica servers and update the API gateway …

  • NGINX Service Mesh with Alan Murphy

    16/04/2020 Duration: 59min

    NGINX is a web server that is used as a load balancer, an API gateway, a reverse proxy, and for other purposes. Core application servers such as Ruby on Rails are often supported by NGINX, which handles routing the user requests between the different application server instances. This model of routing and load balancing between different …

  • Ceph Storage System with Sage Weil

    14/04/2020 Duration: 54min

    Ceph is a storage system that can be used for provisioning object storage, block storage, and file storage. These storage primitives can be used as the underlying medium for databases, queueing systems, and bucket storage. Ceph is used in circumstances where the developer may not want to use public cloud resources like Amazon S3. As …

  • Collaborative SQL with Rahil Sondhi

    13/04/2020 Duration: 48min

    Data analysts need to collaborate with each other in the same way that software engineers do. They also need a high quality development environment. These data analysts are not working with programming languages like Java and Python, so they are not using an IDE such as Eclipse. Data analysts predominantly use SQL, and the tooling …

  • Cadence: Uber’s Workflow Engine with Maxim Fateev

    08/04/2020 Duration: 56min

    A workflow is an application that involves more than just a simple request/response communication. For example, consider a session of a user taking a ride in an Uber. The user initiates the ride, and the ride might last for an hour. At the end of the ride, the user is charged for the ride and …

  • kSQLDB: Kafka Streaming Interface with Michael Drogalis

    07/04/2020 Duration: 48min

    Kafka is a distributed stream processing system that is commonly used for storing large volumes of append-only event data. Kafka has been open source for almost a decade, and as the project has matured, it has been used for new kinds of applications. Kafka’s pubsub interface for writing and reading topics is not ideal for …
